THE ORGANIZATION OF THE REPORT AND SUMMARY

This study provides analytical tools, methods and techniques 
for assessing the design and performance of the Space Shuttle 
Orbiter data processing system (DPS). The computer data processing 
network is evaluated in the following three key areas: 

Queueing Behavior; 

Synchronization; 

Network Reliability. 

The report is divided into two main parts. Part I consists of 
detailed modeling and analyses of queueing and synchronization 
aspects of the DPS. Part II involves the evaluation of the overall 
network reliability in the presence of various failure modes. The 
detailed models, techniques, performance measures and results 
presented here fully satisfy all the study objectives outlined in the 
associated technical proposal. 

The structure of the data processing network is presented in
Section I.1. System operation principles and the network configuration
are described, and the characteristics of the computer systems are
indicated.

Traffic, task and subsystem models and parameters are derived
and described in Section I.2. Process parameters and models are
presented for the following network elements: the computer subsystem;
the terminal, task and user traffic; task and application process
parameters; and the communication subnetwork.


The system performance measures are derived, presented and
discussed in Section I.3. We differentiate between
computer-oriented performance measures, user-oriented performance
measures, and system- and network-related performance indices.

Important general queueing models are described, analyzed
and compared in Sections I.4-I.5. Computer system queueing
models are presented in Section I.6. Queueing modeling and analysis
methods for the orbiter DPS are described in Section I.7.

Time-sharing queueing models are described and analyzed in
Section I.4. Included are: time-shared single processor systems;
batch processing systems; round-robin processing; round-robin
with priorities; a round-robin scheme with time-varying priorities;
foreground-background processing schemes; and a multilevel
processor-sharing scheme. The performance characteristics of the
various time-shared schemes are then compared.

Priority queueing models are described and analyzed in 
Section I.5. While time-sharing schemes increase the operational
efficiency of the orbiter computer complex, priority service 
procedures allow the incorporation of task priorities in providing 
the proper grade-of-service for critical tasks. 

In Section I.6, we present queueing models and demonstrate
the performance analysis for the computer system. Operating
systems and memory management techniques are discussed. Computer
scheduling procedures are outlined. The following analytical queueing
models are then presented for studying the queueing behavior of
the computer system: a Markovian queueing model with a finite buffer
facility; a finite task source queueing model; a multi-processor
queueing model; and queueing models involving input/output (I/O)
and CPU interactions.

Queueing modeling and analysis procedures for the Space Shuttle
orbiter avionics system are presented in Section I.7. The underlying
queueing model is described. A time frame model for the computer
system is then chosen. Tasks are classified as cyclic or acyclic.
Proper computer task service times are subsequently allocated.

Queueing models are then chosen and analyzed for cyclic and acyclic 
tasks. Subsequently, the results are integrated to yield a joint 
queueing model. The latter is analyzed, and the system performance 
functions are derived, studied and discussed. We then choose proper 
queueing models for describing message delay and buffer characteristics 
at the user terminals, considering both input and output traffic. 

The synchronization problem is discussed in Section I.8.
Synchronization considerations for the data processing system are 
outlined. A queueing model is presented to relate time offset 
parameters with message delay and buffer queue-size functions. Clock 
synchronization procedures are then presented, discussed, compared 
and analyzed. 

In Part II of the report, system reliability measures are defined
and studied. System and network invulnerability measures are computed.
Communication path and network failure analysis techniques are
presented. The reliability features of the data processing network
are outlined in Section II.1. In Section II.2 we define failure
parameters and reliability performance measures for the computer
complex. The failure analysis for the computer system, when operating
in the simplex mode, is carried out in Section II.3. The corresponding
failure analysis for the redundant computer system is presented in
Section II.4. The invulnerability characteristics and failure
properties of an application subsystem are derived in Section II.5.
These results are integrated and combined in Section II.6, resulting
in the failure analysis of the data processing network.

The techniques, methods and results presented in this study
are of prime importance as tools in assessing the performance of
the orbiter DPS. Furthermore, the models developed and presented
here are of a general, fundamental nature, involving the key aspects
of system reliability, queueing (delay-throughput, grade-of-service
and system utilization measures) and synchronization.
Consequently, they can be used in studying the performance of
the system under a variety of operational conditions,
including future modifications and expansion situations.

I.1 THE STRUCTURE OF THE DATA PROCESSING NETWORK

I.1.1 System and Network Configuration and Operation

The Space Shuttle avionics system contains five general
purpose computers (GPCs) communicating with the avionic
subsystems over serial data buses. A block diagram of the Space
Shuttle avionics system is shown in Fig. I.1.1. Four of the
five GPCs are identically programmed to perform flight-critical
functions, such as guidance, navigation and control. The fifth
computer is programmed to perform non-flight-critical avionic
functions. A block diagram of the data processing and software
subsystem is shown in Fig. I.1.2.

A GPC consists of an IBM AP-101 central processing unit
(CPU) and an input/output (I/O) processor (IOP). Each IOP is
transformer-coupled to the buses, and can transmit or receive
serial digital data at a rate of 1 MHz over each of 24 bus channels.
The data buses, on the other side, are transformer-coupled to
multiplexer/demultiplexer units (MDMs) and digital subsystems.

The MDMs contain analog-to-digital and digital-to-analog 
converters. They interface with analog subsystems, such as 
flight control sensors and effectors (see Fig. I.1.2).

Subsystems that perform similar functions are assigned to 
the same data-bus group. There are seven such groups (see 
Fig. I.1.1). The subsystems have varying levels of redundancy
at the unit level, depending on their criticality. Each unit 
is addressed by a command word over the bus. To prevent the 
loss of more than one redundant unit when one data bus fails, 
no two redundant units interface with the same bus. 
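
The bus-assignment rule stated above (no two redundant units on the same bus) can be checked mechanically. The following is a minimal sketch, not a description of the actual orbiter wiring; the function name, the unit names and the bus numbers are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

def shared_bus_violations(units):
    """Find redundancy groups that place two or more units on one bus.

    `units` maps a unit name to a (redundancy_group, bus_id) pair.
    Returns a sorted list of the offending (group, bus_id) pairs.
    """
    seen = defaultdict(list)
    for name, (group, bus) in units.items():
        seen[(group, bus)].append(name)
    return sorted(key for key, names in seen.items() if len(names) > 1)

# Hypothetical layout: three redundant sensor units spread over three buses.
layout = {
    "sensor_a": ("sensor", 1),
    "sensor_b": ("sensor", 2),
    "sensor_c": ("sensor", 3),
}
```

An empty result means that a single bus failure can remove at most one unit of each redundancy group.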

During time-critical mission phases (i.e., recovery time
less than one second), such as boost, reentry and landing, four
of the five GPCs operate as a redundant set, receiving the same
input data, performing the same flight-critical computations
and transmitting the same output commands. In this mode of
operation, efficient detection and identification of two
flight-critical computer failures is provided by comparing the output
commands and "voting" on the results. This involves the voting
subsystem. After two failures, the remaining two computers in
the set use comparison and self-test techniques to provide
tolerance of a third fault. The voting mechanism thus allows a
computer to transmit incorrect commands to critical subsystems
for an indefinite number of cycles without adverse effects
on system operation.
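
The comparison-and-vote idea can be sketched as a strict-majority vote over the output commands of the redundant set. This is only an illustration under simplifying assumptions (commands are directly comparable values, and the function name `vote` is hypothetical); it does not describe the actual voter-effector hardware.

```python
from collections import Counter

def vote(commands):
    """Return (majority_value, dissenting_indices) for redundant outputs.

    Raises ValueError when no value is held by a strict majority, which is
    the situation in which comparison alone cannot identify the faulty unit.
    """
    counts = Counter(commands)
    value, n = counts.most_common(1)[0]
    if 2 * n <= len(commands):
        raise ValueError("no strict majority among redundant outputs")
    dissenters = [i for i, c in enumerate(commands) if c != value]
    return value, dissenters
```

With four computers a single faulty output is outvoted and identified; with only two computers left, a disagreement is detected but not resolved by voting, which is consistent with the use of self-test techniques for the third fault.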

The system operates as follows. Each bus within a data-bus
group is assigned, under software control, to operate in either
a command or a listen mode. In the command mode, data requests
and commands are issued to the subsystems over the bus and data
are received over the same bus. In the listen mode, data are
only received on the bus.

In the flight-critical sensor and control data-bus group
(two subgroups of four buses), one bus in each subgroup is
assigned to operate in the command mode (in each redundant-set
computer) and the remaining three are assigned to operate in the
listen mode. In the inter-computer channel (ICC) data-bus group,
containing five buses, one bus (in each computer) is in the command
mode and the remaining four are in the listen mode.

Data Collection. Each of the redundant subsystems is
connected to a different bus. Thus, a different computer requests
data from each of the subsystems and the returned data are
available to all other computers in the set. The listening
computers are informed that the subsystem data are available
by receiving a listen command, which is issued by the command
computer just prior to issuing the data request command to the
subsystem. In this way, identical input data are available to
each computer in the redundant set.

In noncritical phases of the mission, each of the GPCs is 
associated with a proper dedicated subset of subsystems. This 
non-redundant configuration is termed the simplex mode. 

Data Output. Consider the redundant mode. Each channel of
the (voting) effector subsystem is connected to a different
bus of the group. Thus, a different computer transmits command
data to each of the voter-effector channels. Hence, a voter-effector
subsystem requires four inputs, which it receives from four different
computers. Since buses are interconnected to all computers, each
computer can listen to the command data sent out by each of the
other computers.

For inter-computer communication transfer, each computer 
communicates with all other computers. A computer can thus 
pass data to all others, request data from the other computers 
and perform any set of integrated tasks. No subsystem is connected 
to the ICC buses. 

The main characteristics of the Space Shuttle orbiter
avionics data processing system are summarized as follows
(see Figs. I.1.1-I.1.2).

1. The avionics system provides data processing capabilities 
for guidance, navigation and control (GN&C); communications 
and tracking (C&T); displays and controls (D&C); system 
performance monitoring; payload management; payload 
handling; subsystem sequencing; and selected ground functions. 

2. The system accepts input commands and/or data from the crew,
on-board sensors, and external sources.

3. The system performs computations and processing. It generates
output commands and data as necessary to accomplish the
requirements specified for the above mentioned tasks, as
well as for any required internal purposes.

4. The system is topologically structured around a central
set of five general-purpose computers (GPCs) which are
interconnected to the subsystems so that they may be
operated in redundant groups to provide critical sources.
Each computer has a memory capacity of 65,000 32-bit words.
Additional storage of programs and fixed data is provided
by two mass memory units, each having a data capacity of
134 megabits.

5. Data transfer between the computer center and the data users
is through a data bus network. This network is composed
of serial, half-duplex data channels operating at a rate
of 1 megabit/sec.

6. Interface adaptation between the data bus network and the
orbiter subsystems is accomplished by multiplexer/demultiplexer
(MDM) units. These units provide signal conversion capability,
digital-to-analog (D/A) as well as analog-to-digital (A/D), and
multiplexing/demultiplexing functions.

7. Engine interface units provide operational control of the 
main engines from GN&C commands. The units also provide 
main engine data for recording, telemetry or GSE. 

8. Also incorporated in the system are display electronics
units, CRT displays, keyboards, manual controls and controller
manipulator instrumentation units.

I.1.2 Characteristics of the Computer System

We have indicated in the previous section that the heart of
the Space Shuttle avionics processing system is a set of five
general-purpose computers (GPCs). Four of these computers can operate
in a parallel redundant mode during flight-critical phases of a mission.
We summarize in this section the major characteristics of these
computers on board the Space Shuttle orbiter.

The following are the principal characteristics of the
on-board GPCs.

1. The GPCs are designed as an adaptation of the IBM AP-101
computer.

2. Computer size is 0.87 cu.ft., and weight 57.9 lbs. Input 
power is 350 watts. 


3. The computer uses transistor-transistor logic, medium and
large scale integration, and multilayer interconnection boards.

4. Data flow is in parallel.

5. Both fixed point and floating point arithmetic can be 
used. 

6. Data word length (fixed point) is equal to 16 or 32 bits.
Data word length (floating point) is equal to 32 or 64 bits.
Instruction word lengths are equal to 16 and 32 bits.

7. There are 154 instructions in the computer instruction
repertoire.

8. The computing speed is equal to: $480 \times 10^3$ operations/sec,
under fixed-point; $325 \times 10^3$ operations/sec, under floating-point.

9. The computer incorporates as special architectural features:
microprogramming, a higher order language, 24 general registers
and a 19-level interrupt structure. As support software it
contains: an assembler, a linkage editor, a simulator, a
self-test program, a functional set and a compiler.

10. Memory is in the form of pluggable ferrite core modules.

Memory capacity = 1310720 bits
= 40960 32-bit words
Memory access time = 0.375 μsec
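
As a quick consistency check on the memory figures, dividing the stated capacity in bits by the stated word count gives the word length, and the reciprocal of the access time bounds the memory access rate. The variable names below are illustrative only; the numbers are the ones quoted in item 10.

```python
MEMORY_BITS = 1_310_720      # stated memory capacity [bits]
MEMORY_WORDS = 40_960        # stated memory capacity [words]
ACCESS_TIME_SEC = 0.375e-6   # stated memory access time [sec]

# Word length implied by the two capacity figures.
word_length_bits = MEMORY_BITS // MEMORY_WORDS

# Upper bound on memory accesses per second implied by the access time.
max_accesses_per_sec = 1.0 / ACCESS_TIME_SEC
```

The two capacity figures are mutually consistent only for a 32-bit word.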

The main characteristics of the computer system on-board 
the Space Shuttle orbiter are summarized by the following. 

1. Multiple high performance computers are used to provide the
total computing capacity, and system flexibility and reliability.
During critical phases, four of the computers operate
in parallel, and "voting" is used. During non-critical phases,
a simplex mode is implemented. One computer is then used for
GN&C tasks and one for system management tasks.

2. Separate input/output (I/O) processors (IOPs) are used for
information transfer and control. Each GPC consists of two
separate processing units: a central processing unit (CPU),
which provides the central computational capability, and an
input/output processor (IOP), which performs and controls
the I/O operations for the CPU.

3. Time-shared serial digital data buses are used to accommodate
the data traffic among the computers and between the computers
and other subsystems. There are 24 data buses, organized into
7 groups. The data transfer is time-division multiplexed (TDM)
using pulse code modulation (PCM). Each bus operates at a clock
rate of 1 Mbit/sec.

4. Microprogramming is used for both the CPU and the IOP. This
allows the implementation of a comprehensive instruction
repertoire.

5. Both floating-point and fixed-point arithmetic operations are
provided in the CPU for easier programming and program validation.

6. A higher order language is used in the programming of the CPU
to reduce software effort and yield better control. This
language is designated here as HAL/S.

7. As main memory, random-access, non-volatile,
destructive-read-out ferrite cores are used. They provide maximum
reliability.

8. High capacity mass memories are used for permanent
on-board off-line bulk storage to supplement the on-line
random-access computer main memory. The mass memories are
two identical tape units.

A functional block diagram of the GPC, showing the
interconnection between the CPU and the associated IOP, is shown in
Fig. I.1.3. Concerning the CPU-IOP system, the following
characteristics are noted.

The primary communication interface between the CPU and its
IOP is provided by a 36-bit bi-directional data channel.

The main properties of the CPU have already been indicated
above. We further note that the computer has a 96% fault detection
capability, achieved by built-in test equipment and self-testing
programs.

All data transmission among GPCs and between GPCs and the
avionic subsystems is performed by the IOPs under CPU control. One
IOP is associated with each CPU to provide direct and passive
monitoring of data traffic.

Each IOP interfaces with the other IOPs and with the interfacing
subsystems over the 24 separate serial data buses. The IOP contains
a set of 24 independent processors, called Bus Control Element
(BCE) processors. A 25th processor, the Master Sequence Controller
(MSC), controls the operation of the 24 BCEs. These 25 processors
act, in effect, as 25 digital computers and operate from software
programs stored in main memory. The IOP data processing programs
are independent of the CPU programs and have their own unique
instruction set. Each BCE controls a Multiplexer Interface
Adapter (MIA), which is connected to the serial data bus via bus
couplers (see Fig. I.1.3). The MIA transmits and receives
information, encodes and decodes bus data, and tests for parity and
proper synchronization of bits.

I.2 TRAFFIC, TASK AND SUBSYSTEM MODELS AND PARAMETERS

I.2.1 The Network Components

In this section we present the main system parameters and 
statistical distribution functions necessary to construct an 
analytical model for the Space Shuttle data processing system.

In particular, our interest here is to construct proper queueing 
models that will enable the system engineer to predict and evaluate 
the delay-throughput performance of this computer network. The 
relevant set of performance measures will be presented in the next 
section. 

In providing the parameterized models for the system 
components, we classify them into three categories. 

1. The general purpose computers (GPCs) and the computer
subsystem (complex). 

2. Terminals, tasks, users and peripheral equipment. 

3. The communication subnetwork. 

We now consider each of these categories. 

I.2.2 The Computer Subsystem

The main characteristics of the computer subsystem have already 
been presented in Section I.1. For obtaining a global network
model, we choose the following model and parameters. 

The model is shown in Fig. I.2.1. The model enables us to
statistically describe the processing services provided by the CPU
and IOP, the task queueing delay characteristics, buffer overflow
properties and the CPU-IOP interactions. Data and requests for
service arriving at the GPC subsystem are stored in the IOP
queue. Any required IOP processing is granted to the tasks
waiting at the IOP queue in accordance with the specified service
ordering discipline. The latter incorporates fixed (static)
priorities as well as dynamically assigned priority functions.

Subsequently, upon termination of the desired IOP service
portion, the task (or job, or message), or a request associated
with it, is stored at the main queue waiting to be granted
service by the CPU. The desired CPU service can involve a
certain computational effort as well as memory extraction and
accessing duties. The requests or data stored in the main queue
are served in accordance with the underlying priority service
discipline. Between various CPU service periods, the processing
of the underlying task can stop so that certain IOP services
or memory accesses can be completed. This is introduced into
the model (see Fig. I.2.1) by allowing a CPU-IOP-CPU cycle as
well as a CPU-Memory-CPU cycle. Upon termination of its service,
the task data output is stored at the output buffer. It is
transmitted to its destination (properly controlled, as well as
time-division-multiplexed by the computer IOP controls) at the
proper output times.

Major parameters of interest are denoted as follows. 

= memory access time [sec] 

$C_I$ = IOP service rate [bits/sec]

$C_C$ = CPU service rate [bits/sec]

$M_C$ = size of main CPU memory [bits]

$M_I$ = size of input buffer facility [bits]

$M_O$ = size of output buffer facility [bits]



Some of these parameters can be random, in which case we 
are interested in their probability distribution functions, or 
just their means and variances. 

The processing times required at the CPU and IOP levels
depend on the task under consideration. Considering a task of
class k, distinguished by its priority, desired response time
and criticality, we are interested in the following parameters.
Henceforth we identify memory processing, accessing and interruptions
as I/O duties.

$S_I(k)$ = IOP total service time required by a class k task
(request, message), including memory service time [sec].

$T_I(k)$ = IOP continuous service portion required by a class k
task, including memory service time [sec].

$S_C(k)$ = CPU total service time required by a class k task
[sec].

$T_C(k)$ = CPU continuous service portion required by a class k
task [sec].

$K(k)$ = number of times that a class k task requires interruption
in CPU processing for IOP or memory processing.

The parameters mentioned above are random variables. We are
interested in their probability distributions, their means $E(\cdot)$
and variances $\mathrm{Var}(\cdot)$. The associated means (average
values) of these parameters are denoted as follows.



$E[S_I(k)] = \bar{S}_I(k) = \tau_I(k)$ [sec]    (I.2.2-1)

$E[T_I(k)] = \bar{T}_I(k) = \mu_I^{-1}(k)$ [sec]    (I.2.2-2)

$E[S_C(k)] = \bar{S}_C(k) = \tau_C(k)$ [sec]    (I.2.2-3)

$E[T_C(k)] = \bar{T}_C(k) = \mu_C^{-1}(k)$ [sec]    (I.2.2-4)

We then obtain the following relations:

$K(k) = \dfrac{\tau_C(k)}{\mu_C^{-1}(k)}$    (I.2.2-5)

$\tau_I(k) = K(k)\,\mu_I^{-1}(k) = \mu_I^{-1}(k)\,\tau_C(k)/\mu_C^{-1}(k)$.    (I.2.2-6)
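
The two relations above can be evaluated directly: the number of CPU interruptions is the ratio of the mean total CPU service time to the mean continuous CPU service portion, and the mean total IOP service time is that number times the mean continuous IOP service portion. The sketch below uses hypothetical function and argument names, and the numerical values used with it are invented for illustration.

```python
def mean_interruptions(mean_cpu_total, mean_cpu_burst):
    """Number of CPU interruptions: mean total CPU time / mean CPU burst."""
    if mean_cpu_burst <= 0:
        raise ValueError("mean CPU burst must be positive")
    return mean_cpu_total / mean_cpu_burst

def mean_iop_total(mean_cpu_total, mean_cpu_burst, mean_iop_burst):
    """Mean total IOP time: number of interruptions times mean IOP burst."""
    return mean_interruptions(mean_cpu_total, mean_cpu_burst) * mean_iop_burst
```

For instance, a task with 10 msec of total CPU time served in 2 msec bursts is interrupted 5 times; with 0.5 msec of IOP service per interruption, its mean total IOP time is 2.5 msec.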

I.2.3 Terminal, Task and User Traffic

Data traffic distribution within the Space Shuttle avionics
data processing network can be associated with a number of
classified "processes" or tasks. Tasks are divided into task
(or message) classes in accordance with their:

priority;

scheduled/unscheduled status;

message characteristics, such as message lengths and
desired response time.

Tasks can be assigned priorities on a fixed (static) level.
Then class 1 tasks have higher priority than class 2 tasks.
Priorities can also be assigned on a dynamic basis (see Sections
I.4-I.5 for a classification of priority disciplines and the
associated queueing analysis). For example, a dynamic Earlier
Due Date queueing priority discipline can be used.
Then, each task (or job, or message) is associated with a dynamically
changing priority level expressing the criticality of the job as
well as its desired due date (response time). (See Section I.5
for details.) As a particular case, the following priority classes
can be defined.

Class 1 tasks = highest priority tasks, critical.

Class 2 tasks = timely, become critical after a delay
of $\delta_2$ sec.

Class 3 tasks = timely, but become noncritical after a
delay of $\delta_3$ sec.

Class 4 tasks = timely, discarded after a delay of $\delta_4$ sec.

Class 5 tasks = noncritical.
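
The five classes above can be read as a rule that maps a task's class and its accumulated delay to a current status. The following sketch encodes that reading; the function name, the status strings and the threshold values passed to it are hypothetical, and the per-class delay thresholds are supplied by the caller.

```python
def task_status(task_class, waited, delta):
    """Status of a task of the given class after waiting `waited` seconds.

    `delta` maps the class numbers 2, 3 and 4 to their delay thresholds
    in seconds (values are supplied by the caller; none are from the report).
    """
    if task_class == 1:
        return "critical"
    if task_class == 2:
        return "critical" if waited > delta[2] else "timely"
    if task_class == 3:
        return "noncritical" if waited > delta[3] else "timely"
    if task_class == 4:
        return "discarded" if waited > delta[4] else "timely"
    if task_class == 5:
        return "noncritical"
    raise ValueError(f"unknown task class: {task_class}")
```

A dynamic priority discipline would re-evaluate this status as a task ages in the queue, so that a class 2 task is promoted and a class 4 task is dropped once its threshold is exceeded.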

To implement a dynamic queueing priority service discipline, 
the network controller is designed to administer demand-assignment 
assessing and service ordering procedures. 

Jobs, or tasks, are also classified as being cyclic or
acyclic (not cyclic). Cyclic jobs require service on
a periodic basis. Acyclic tasks use the processors on an
aperiodic basis.

One also distinguishes between scheduled and unscheduled
tasks. Scheduled tasks can be cyclic or acyclic. They cover the
following four areas.

• User interface tasks.

• System control tasks.

• Guidance, navigation and control tasks.

• System management tasks.

Tasks (jobs, or processes) are activated by either internal 
or external stimuli. The computer processor and the data 
network are assigned to tasks on a priority basis, as 
indicated above. Service of a task, or process, can be 
preempted (interrupted) by higher priority tasks. Certain 
tasks can be served on a non-preemptive basis. Each task is
assigned to a "service class" and given priority within the class.

In addition to representing "processes" requiring service
by the Avionic DPS as tasks, one also identifies the
information-bearing units called routines and messages. Routines
serve as modules executed in performing a task. They can be included
or shared among several different tasks. Messages are defined
to be groups of data handled and transmitted within the data
processing network. Messages can be declared as elements of
certain tasks.

The devices associated with the Space Shuttle orbiter
Avionics DPS are described as follows.

• 15 MDMs (Multiplexer/Demultiplexer Units).
Max. record size = 1024 bytes.
Input/Output rates = 120 bytes/msec.
Can be shared among tasks.

• 4 DEUs (Display Electronics Units).
Can be shared among tasks.
Max. record size = 8192 bytes.
Input rate = 120 bytes/msec.
Output rate = 62 bytes/msec.

• 3 DDUs (Display Driver Units).
Can be shared among tasks.
Can hold an unlimited record size.
I/O rate = 120 bytes/msec.

• 3 KBUs (Keyboard Units).
Output rate = 1 byte/msec.
Associated delay of 1 msec.

• 2 PCMMUs (Pulse Code Modulation Master Units).
Can be used by all tasks.
Max. record size for each unit = 2048 bytes.
I/O rate = 120 bytes/msec.
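
The record sizes and I/O rates listed above fix the time needed to move one maximum-size record to or from each device. The short sketch below computes these times; the function and dictionary names are illustrative, and the figures are the ones given in the device list.

```python
def transfer_time_msec(record_bytes, rate_bytes_per_msec):
    """Time to move one record at the given device I/O rate [msec]."""
    return record_bytes / rate_bytes_per_msec

# Maximum-size record transfer times, from the listed sizes and rates.
max_record_transfer_msec = {
    "MDM": transfer_time_msec(1024, 120),
    "DEU input": transfer_time_msec(8192, 120),
    "DEU output": transfer_time_msec(8192, 62),
    "PCMMU": transfer_time_msec(2048, 120),
}
```

These times are lower bounds on the device occupancy per record, and are useful as service-time inputs when sizing the device buffers.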

Display data can be classified as follows.

Time critical display data. Memory resident, accessible
within $\eta_1$ sec. Typically, $\eta_1$ = 1 sec.

Sequence critical data. Accessible within $\eta_2$ sec. Typically,
$\eta_2$ = 2 sec. Can be resident in memory, if required.

Noncritical data. Accessed as soon as possible. Access time
can be minimized by tape head positioning and file ordering.

In the keyboard subnetwork, a message is composed of a
keystroke or a series of keystrokes sent to the GPC system by a DEU.
The DEUs are polled by the GPCs. The polling frequency is

f(DEU) polling times/sec.

For example, in certain operational modes one sets

5 times/sec ≤ f(DEU) ≤ 10 times/sec.

A given DEU receives commands from only one GPC on its bus. 

DEU transactions can be very long. It is consequently
important to evaluate the probability of overflow of the
associated I/O buffer.
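
One simple way to put a number on the overflow probability is to ask how likely it is that more keystrokes arrive between two successive polls than the buffer can hold. The sketch below does this under the assumption that keystrokes arrive as a Poisson stream; that assumption, the function name and the example rates are not taken from the report and are used for illustration only.

```python
import math

def overflow_probability(keystroke_rate, polls_per_sec, buffer_size):
    """P(more than `buffer_size` keystrokes arrive in one polling interval).

    Assumes Poisson keystroke arrivals at `keystroke_rate` per second
    (a modeling assumption) and a polling interval of 1 / polls_per_sec.
    """
    mean = keystroke_rate / polls_per_sec
    p_fit = sum(math.exp(-mean) * mean**n / math.factorial(n)
                for n in range(buffer_size + 1))
    return max(0.0, 1.0 - p_fit)
```

Raising the polling frequency from 5 to 10 times/sec halves the mean number of keystrokes per interval and therefore lowers the overflow probability for a fixed buffer size.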

Update data from GPCs to DEUs is transmitted at one of a
number of possible rates. Typically, the rate is 2 Hz for
analog data, and is equal to any one of 1 Hz, 0.5 Hz, 0.25 Hz,
0.125 Hz for digital data.

Dedicated displays are updated regularly by the GPCs.

Dedicated control inputs are polled by the GPCs at proper 
polling rates. 

The role of process management is to supervise the allocation 
of the internal computer resources and control the execution of the 
application processes. For that purpose, use is made of dynamic 
queues and tables containing the state of the internal resources. 

Process control is responsible for allocation of the GPCs
to application processes. This is accomplished according to (the
above-mentioned) preassigned process (task) priorities, controlled
by the demands of the crew, scheduled duties and conditions polled
in the avionics equipment.

Scheduled processes in queues are noted to be in one of three 
states: 

• Active state; the process controls the CPU. 

• Ready state; the process is ready to utilize the CPU, but 
has not attained control yet. 

• Wait state; time must pass until a certain event occurs or 
an I/O operation is completed. 
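
The three process states can be sketched as a small state machine. The state names follow the list above; the set of allowed transitions (dispatch, preemption, blocking on an event or I/O, and wake-up) is the conventional one and is an assumption here, since the report does not enumerate the transitions.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"   # the process controls the CPU
    READY = "ready"     # ready to use the CPU, not yet in control
    WAIT = "wait"       # waiting for an event or an I/O completion

# Assumed (conventional) transitions between the three states.
ALLOWED = {
    (State.READY, State.ACTIVE),   # dispatched
    (State.ACTIVE, State.READY),   # preempted by a higher priority task
    (State.ACTIVE, State.WAIT),    # blocks on an event or I/O operation
    (State.WAIT, State.READY),     # event occurred or I/O completed
}

def transition(current, new):
    """Return `new` if the move is allowed, otherwise raise ValueError."""
    if (current, new) not in ALLOWED:
        raise ValueError(f"illegal transition: {current.name} -> {new.name}")
    return new
```

Under this reading a waiting process never regains the CPU directly; it first re-enters the ready queue and competes according to its priority.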

I.2.4 Task and Application Process Parameters

According to the descriptions of the nature of the application 
processes and tasks in the previous section, the following parameters 
are defined. These are the major parameters used in a macroscopic 
performance analysis of the avionics data processing system. 

Different tasks make different service demands upon the data
processing network. Tasks are divided into priority (or service)
classes. GPC service times required by a class-k task have
been defined in Section I.2.2. In particular, we have:


$E[S(k)] = E[S_I(k) + S_C(k)] = \tau(k) = \tau_I(k) + \tau_C(k)$
= average total GPC service time required by a class k
message;    (I.2.4-1)

$\mathrm{Var}[S(k)] = \mathrm{Var}[S_I(k) + S_C(k)] = V(k)$
= variance of the total GPC service time required by a class k
message;    (I.2.4-2)

where

$S(k) = S_I(k) + S_C(k)$ = total GPC service time required by a
class k message.    (I.2.4-3)
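
The mean and variance of the total GPC service time can be estimated from paired observations of the IOP and CPU service times of individual tasks. The sketch below assumes such paired samples are available; the function name and the sample values used with it are hypothetical.

```python
from statistics import mean, pvariance

def total_service_stats(iop_times, cpu_times):
    """Mean and variance of the per-task total of IOP and CPU service times.

    The two sequences hold paired observations for the same tasks.
    """
    if len(iop_times) != len(cpu_times):
        raise ValueError("IOP and CPU samples must be paired")
    totals = [s_i + s_c for s_i, s_c in zip(iop_times, cpu_times)]
    return mean(totals), pvariance(totals)
```

The mean of the total always equals the sum of the two means, whereas the variance of the total equals the sum of the two variances only when the IOP and CPU service times are uncorrelated, which is why the variance is computed here on the per-task totals.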

In addition to using GPC resources, a class k message
might require various network and device resources. The above
service times describe the overall time required by a task in
directly utilizing the CPU (through $S_C(k)$) or in requiring any I/O
processing (through $S_I(k)$). The local behavior and buffer overflow
characteristics of each device will also be modelled.

In addition to characterizing the task service times, one 
also needs to statistically describe the stochastic process of 
task request times. 

The stochastic arrival process $\{t_n(k), n = 1, 2, \ldots\}$ is
described as follows. The time $t_n(k)$ denotes the instant of time at
which the n-th task (or job, or message) of class k signals its request
for service. This signaling can be realized by the actual arrival
of a request for service at the GPC, the actual arrival of the proper
data (or message), or any such scheduled arrival.

The interarrival times $\{T_n(k), n = 1, 2, \ldots\}$ are defined by

$T_n(k) = t_n(k) - t_{n-1}(k)$, $n = 1, 2, \ldots$, with $t_0(k) = 0$.    (I.2.4-4)

Thus, $T_n(k)$ is in general a random variable denoting the
time between the arrival of the n-th class k message and the arrival
of the preceding (n-1)-st class k message. We usually assume
$\{T_n(k)\}$ to be a sequence of independent identically distributed
random variables. We then set

$\bar{T}(k) = E[T(k)]$ = average interarrival time for class
k messages;    (I.2.4-5)

$V_T(k) = \mathrm{Var}[T(k)]$ = variance of the interarrival time
for class k messages.    (I.2.4-6)

The arrival traffic associated with cyclic tasks can further
be characterized as follows. The starts of the requests for
service of a class k cyclic task are again governed by the stochastic
arrival stream $\{t_n(k)\}$ and the associated interarrival times
$\{T_n(k)\}$. However, once service has started for a certain cyclic
task, the service requirement is specified by:

$T_c(k)$ = time between required services of a class k cyclic task
= time period associated with a class k cyclic task;    (I.2.4-7)

$\tau_c(k)$ = service time of a cyclic class k task within a single
associated period;    (I.2.4-8)

$\bar{\tau}_c(k) = E[\tau_c(k)]$ = mean of $\tau_c(k)$;    (I.2.4-9)

$\mathrm{Var}[\tau_c(k)]$ = variance of $\tau_c(k)$.    (I.2.4-10)
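
The interarrival definitions given earlier in this section can be illustrated with a small computation: each interarrival time is the difference between successive request instants, with the origin taken as time zero, and the mean and variance are then estimated from the resulting sequence. The function name and the sample instants are hypothetical.

```python
from statistics import mean, pvariance

def interarrival_times(arrivals):
    """Differences between successive request instants.

    `arrivals` holds the instants t_1, t_2, ... in non-decreasing order;
    the instant preceding the first arrival is taken to be 0.
    """
    previous = 0.0
    gaps = []
    for t in arrivals:
        if t < previous:
            raise ValueError("arrival instants must be non-decreasing")
        gaps.append(t - previous)
        previous = t
    return gaps
```

For request instants 0.5, 1.25 and 3.0 sec the interarrival times are 0.5, 0.75 and 1.75 sec; `mean` and `pvariance` applied to that list give sample estimates of the average interarrival time and its variance.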

Fig. I.2.4.1 illustrates the evaluation of service times
required by a class-k cyclic task.

[Fig. I.2.4.1. Timing diagram for a class-k cyclic task; labels:
$T_c(k)$, $\tau_c(k)$, $\tau_I(k)$, $\tau_C(k)$.]

We note that we can allow the periodic times $\tau_c(k)$, dedicated
to servicing a cyclic class k task, to be identical or of randomly
varying durations.

The arrival times $\{t_n(k)\}$ and associated interarrival times
$\{T_n(k)\}$ for scheduled tasks can be regarded as fixed deterministic
values. This is observed by noting that the signals indicating
request-for-service by scheduled tasks are issued at a priori
known fixed instants of time.

Arrival times $\{t_n(k)\}$ and interarrival times $\{T_n(k)\}$ of requests
for service of non-scheduled tasks are regarded as random variables.
The mean and variance of the interarrival times, $\bar{T}(k)$ and $V_T(k)$, have
been defined by (I.2.4-5) and (I.2.4-6), respectively. It can be beneficial
for the advanced performance analysis to also have the interarrival
time distribution function $F_{T,k}(x)$, assuming $\{T_n(k)\}$ to be a
sequence of i.i.d. random variables; thus,

    $F_{T,k}(x) = P\{T_n(k) \le x\}$,  $x \ge 0$.              (I.2.4-11)


Unscheduled tasks are often assumed to arrive according
to a Poisson process at a rate of $\lambda(k)$ [mess./sec]. Then, we have

    $F_{T,k}(x) = 1 - e^{-\lambda(k) x}$,  $x \ge 0$,          (I.2.4-12)

so that the interarrival times are exponentially distributed. Note
that

    $\lambda(k) = \{E[T_n(k)]\}^{-1} = [\bar{T}(k)]^{-1}$
                = average number of class k task arrivals
                  per unit time (sec).
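
The relation between the Poisson rate and the mean interarrival time can be checked numerically. The following is a minimal sketch, not part of the original study: the function names, the rate of 5 mess./sec and the sample size are illustrative assumptions. It draws exponential interarrival times, accumulates them into arrival instants, and estimates the mean and variance of the interarrival times.

```python
import random

def sample_arrival_times(rate, n, seed=0):
    """Return n arrival instants of a Poisson stream with the given rate."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(n):
        t += rng.expovariate(rate)  # exponential interarrival time
        times.append(t)
    return times

def interarrival_stats(times):
    """Sample mean and variance of the interarrival times, taking t_0 = 0."""
    gaps = [b - a for a, b in zip([0.0] + times[:-1], times)]
    mean = sum(gaps) / len(gaps)
    var = sum((g - mean) ** 2 for g in gaps) / (len(gaps) - 1)
    return mean, var

# Hypothetical class: 5 messages per second, 20000 simulated arrivals.
times = sample_arrival_times(rate=5.0, n=20000)
mean, var = interarrival_stats(times)
print(f"mean interarrival = {mean:.4f} s, estimated rate = {1 / mean:.3f} mess./sec")
```

For exponential interarrival times the variance equals the square of the mean, so the two estimates should be close to 0.2 sec and 0.04 sec^2 for this assumed rate.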

Cyclic tasks are statistically characterized by $\{T_c(k), \tau_c(k)\}$
within each activity period. For unscheduled cyclic tasks, one can
assume requests for an activity period to start at random times
distributed according to a Poisson stream with intensity $\lambda_c(k)$
[requests/sec].

When considering the buffer behavior at a device, the following
statistical characterizations are required:

    the storage capacity of the buffer associated with device i;

    $T_I^{(i)}$ = interarrival times of tasks (messages) at device i;

    $F_I^{(i)}(x) = P\{T_I^{(i)} \le x\}$, $\bar{T}_I^{(i)}$, $Var(T_I^{(i)})$
            = distribution, mean and variance of $T_I^{(i)}$;

    $T_O^{(i)}$ = interdeparture times of tasks (messages) out of the
            buffer of device i;

    $F_O^{(i)}(x)$, $\bar{T}_O^{(i)}$, $Var(T_O^{(i)})$
            = distribution, mean and variance of $T_O^{(i)}$;

    $S^{(i)}$ = processing time at device i;

    $F_S^{(i)}(x)$, $\bar{S}^{(i)}$, $Var(S^{(i)})$
            = distribution, mean and variance of $S^{(i)}$.

Note that task polling processes can be modelled as cyclic
processes, using the characterizations presented above.

1.2.5 The Communication Subnetwork

The communication subnetwork is composed of the subsystem that
provides for the transmission of information between the GPCs and
the users, terminals and application devices.

For the Space Shuttle DPS, the avionics communication subnetwork
is composed of a network of bus lines. A bus line connects
all computers to a certain device. The lines are used in either a
command or a listen mode. In the command mode, the line use is supervised
and controlled by a commanding GPC to transmit or receive information.
The other computers can listen. In the listen mode, a computer can
only receive data over the line.

The rate of transmission of data over each bus line is 1 MHz. 

To study the utilization of each bus line, we set:

    $f(i)$ = rate of transmission of information over bus line
             i [bps];                                          (I.2.5-1)

    $\bar{f} = \frac{1}{N} \sum_{i=1}^{N} f(i)$ = average rate of data
             transmission over a bus line [bps],               (I.2.5-2)

where

    N = number of bus lines (connecting GPCs and devices).     (I.2.5-3)

In addition, one is interested in the utilization of the ICC
(inter-computer communication) lines, for which we set:

    $f_I$ = average rate of data transmission over an ICC line [bps].
                                                               (I.2.5-4)

Each bus line serves as a half-duplex communication channel.
It can also be modelled as a multiplexed set of half-duplex
subchannels.

We set:

    $C_L(i)$ = transmission rate over the i-th bus line [bps];     (I.2.5-5)

    $\Delta_L(i)$ = bit time lag over the i-th bus line [sec];     (I.2.5-6)

    $P_L(i)$ = probability of a bit error (due to noise,
               bursts, interruptions) on the i-th bus line.        (I.2.5-7)

The topological structure of the communication subnetwork is
specified by a connectivity matrix

    $C = [c_{ij}]$,                                            (I.2.5-8)

where

    $c_{ij} = 1$, if node i is connected to node j;
    $c_{ij} = 0$, otherwise.

The nodes in our network are the application devices and the
processing GPCs.

In particular, we have

    $d_i = \sum_j c_{ij}$ = degree of node i
         = number of lines connected to node i.                (I.2.5-9)

The degree $d_i$ of node i represents the number of lines connected
to node i. For certain nodes, this number is limited by physical,
performance and reliability constraints.
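
As an illustration of the degree definition, the following minimal sketch computes the node degrees as row sums of a 0/1 connectivity matrix. The 4-node matrix is a hypothetical example, not the orbiter topology, and the function name is an assumption made for illustration.

```python
def node_degrees(c):
    """Row sums of a 0/1 connectivity matrix: d_i = sum over j of c[i][j]."""
    return [sum(row) for row in c]

# Hypothetical 4-node network: nodes 0 and 1 stand for GPCs,
# nodes 2 and 3 for application devices.
C = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
degrees = node_degrees(C)
print(degrees)
```

A design constraint on the number of lines per node can then be checked by comparing each entry of `degrees` against the allowed maximum for that node.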
A routing procedure (or algorithm) needs to be specified
for directing the information between the GPCs and the application
devices. Involved in this algorithm is the selection of the
transmission path. Related to it are the tasks of performing memory
allocation, task scheduling, unit selection, element loading and
I/O services.

In the Space Shuttle orbiter avionics communication subnetwork,
there are 27 data link buses. There are also 11 half-duplex links
for interdevice communications. The data links are divided as
follows.

• 5 data buses for ICC, max. transmission rate C = 1 MHz.

• 4 data buses for display system communication, C = 1 MHz.

• 8 data buses for flight critical communication, C = 1 MHz.

• 2 data buses for mission control communication, C = 1 MHz.

• 2 data buses for mass memory communication, C = 1 MHz.

• 2 ground interface buses, C = 1 MHz.

• 4 PCMMU communication buses, C = 1 MHz.

• 4 data links for communication between DEUs and DUs,
  C = 800 Kbps.

• 5 data links between DEUs and KBUs, C = 800 bps.

• 2 data links between PCMMUs and instruments,
  C = 800 Kbps.


1.3 PERFORMANCE MEASURES 


1.3.1 Computer Oriented Performance Measures 

The computer complex in the Space Shuttle orbiter avionics
system is the most crucial subsystem in determining
the network performance. We define in this section the major
computer oriented performance measures. In the following sections
we will define user (or task) oriented and subsystem (or network)
oriented performance measures.

It is important to know the extent to which we utilize the
computing, processing and storing capabilities of the computer
system. The following performance indices refer to any
arbitrary GPC. This is also equivalent to considering the
4 GPCs as a single computing machine for the modes in which the
4 computers are used in parallel as a redundant set.

The index of utilization of a GPC, $U_G$, is defined by

    $U_G$ = relative time during which a GPC is used
          = P{a GPC is busy}.                                  (I.3.1-1)

Note that

    $0 \le U_G \le 1$.

Similarly, the index of utilization of an IOP (Input/Output
Processor) is defined by

    $U_{IOP}$ = relative time during which an IOP is used
              = P{an IOP is busy}.                             (I.3.1-2)

Note that $0 \le U_{IOP} \le 1$.

The index of utilization of a CPU is given as

    $U_{CPU}$ = relative time during which a CPU is used
              = P{a CPU is busy}.                              (I.3.1-3)

Also, $0 \le U_{CPU} \le 1$.

The overall GPC system is composed of the CPU, IOP and
associated memory and storage facilities. One can thus define a
GPC to be busy if either its CPU or its IOP, or both, are busy
(i.e., used for processing, computing or active storing). Then,
we will have

    $1 - U_G = (1 - U_{IOP})(1 - U_{CPU})$,                    (I.3.1-4)

so that

    $U_G = 1 - (1 - U_{IOP})(1 - U_{CPU})$
         $= U_{CPU} + U_{IOP} - U_{IOP} U_{CPU}$.
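
The product form for the GPC utilization can be evaluated directly. The sketch below is illustrative only: the function name and the sample values 0.6 and 0.3 are assumptions, and the product form itself treats the CPU and IOP busy states as statistically independent.

```python
def gpc_utilization(u_cpu, u_iop):
    """U_G = 1 - (1 - U_IOP)(1 - U_CPU): GPC is busy if CPU or IOP is busy."""
    for u in (u_cpu, u_iop):
        if not 0.0 <= u <= 1.0:
            raise ValueError("utilization indices must lie in [0, 1]")
    return 1.0 - (1.0 - u_iop) * (1.0 - u_cpu)

# Hypothetical values: CPU busy 60% of the time, IOP busy 30% of the time.
u_g = gpc_utilization(u_cpu=0.6, u_iop=0.3)
print(f"U_G = {u_g:.2f}")
```

The same value is obtained from the expanded form, 0.6 + 0.3 - 0.6 * 0.3.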

It is also often of interest to find the statistical
characteristics governing the use of a GPC. We identify alternating
idle periods and busy periods in observing the use of the CPU, IOP
and the GPC buffers. We then define:

    $\bar{B}_{CPU}$, $Var(B_{CPU})$ = mean and variance of the busy-period
                duration $B_{CPU}$ for the CPU;                (I.3.1-5)

    $\bar{I}_{CPU}$, $Var(I_{CPU})$ = mean and variance of the idle-period
                duration $I_{CPU}$ for the CPU;                (I.3.1-6)

    $\bar{B}_{IOP}$, $Var(B_{IOP})$ = mean and variance of the IOP
                busy period;                                   (I.3.1-7)

    $\bar{I}_{IOP}$, $Var(I_{IOP})$ = mean and variance of the IOP
                idle period;                                   (I.3.1-8)

    $\bar{B}_G$, $Var(B_G)$ = mean and variance of the GPC
                busy period;                                   (I.3.1-9)

    $\bar{I}_G$, $Var(I_G)$ = mean and variance of the GPC
                idle period.                                   (I.3.1-10)

It is important to also measure the utilization of the
memory and storage devices. For that purpose, the following
performance indices are defined.

    $UM_G$ = index of utilization of the GPC memory             (I.3.1-11)
           = average fractional part of the GPC memory
             which is not used;                                 (I.3.1-12)

    $UM_I$ = index of utilization of the GPC input buffer;      (I.3.1-13)

    $POF_I$ = probability of overflow of the GPC input buffer;  (I.3.1-14)

    $UM_O$ = index of utilization of the GPC output buffer;     (I.3.1-15)

    $POF_O$ = probability of overflow of the GPC output buffer. (I.3.1-16)


The GPC throughput index is used to assess the average amount
of data processed, and tasks performed, by the GPC per unit time.
Thus:

    $TH_G$ = the GPC throughput
           = average number of bits per sec served by the GPC.

We can also consider the number of tasks per unit time performed
by the computer:

    $TTH_G$ = the GPC task (job, message) throughput
            = average number of tasks (jobs, messages) processed
              by the GPC (or computer complex) per sec.

1.3.2 User Oriented Performance Measures 

The major index of performance associated with a user or a
task (job, message) is the associated task time delay.

Tasks (jobs or messages) are classified into classes (as
detailed in Section 1.2) in accordance with their priorities,
criticality and required time delays.

The response time or time delay of a class-k task is denoted
by

    $D(k)$ = time delay, response time of a class k task
             (message, job).                                   (I.3.2-1)

The response time D(k) is the period of time measured from the
instant the class-k task records its request for service to the
instant its service has been completed.

We also set:

    $W(k)$ = waiting time of a class k task
           = time from the instant the task request is recorded
             to the instant its service starts.                (I.3.2-2)

Thus, W(k) denotes the time duration that a class k task is delayed
until its processing has started.

The processing time required by a class k task has been
defined (see (I.2.4-3)) as S(k). We then have that

    $D(k) = W(k) + S(k)$ [sec].                                (I.3.2-3)

The time-delay and waiting-time functions are random variables.
We are generally interested in their distributions:

    $D_k(x) = P\{D(k) \le x\}$,  $x \ge 0$;                    (I.3.2-4)

    $W_k(x) = P\{W(k) \le x\}$,  $x \ge 0$.                    (I.3.2-5)

In particular, it is of interest to use as a performance measure
the user average time delay. We set:

    $\bar{D}(k) = E[D(k)]$ = average class k task time delay
                  (response time);                             (I.3.2-6)

    $\bar{W}(k) = E[W(k)]$ = average class k task waiting time. (I.3.2-7)

Since

    $\bar{S}(k)$ = average processing time required by a class k task,

we have

    $\bar{D}(k) = \bar{W}(k) + \bar{S}(k)$.                    (I.3.2-8)

It is also important in many cases to evaluate the variances
of the task delay and waiting times:

    $Var[W(k)]$, $Var[D(k)]$;                                  (I.3.2-9)

    $Var[D(k)] = Var[W(k)] + Var[S(k)]$.                       (I.3.2-10)

The standard deviation of the class-k task response time is
then given by

    $\sigma(k) = \sqrt{Var[D(k)]}$.                            (I.3.2-11)

In measuring the peak task response time, one is interested in the
probability

    $P\{|D(k) - \bar{D}(k)| > \alpha\}$,                       (I.3.2-12)

expressing the probability (fraction of time) that the response
time deviates from its average value by more than $\alpha$. By Chebyshev's
inequality, we conclude that

    $P\{|D(k) - \bar{D}(k)| > 3\sigma(k)\} \le 1/9 \approx 11\%$.   (I.3.2-13)

Therefore, we can estimate the peak delay of a class k task by
setting

    $D_p(k) = \bar{D}(k) + 3\sigma(k)$.                        (I.3.2-14)

Relation (I.3.2-13) indicates that at least about 89% of the time
the delay D(k) will be lower than this value.
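
The peak-delay estimate can be computed from the mean and variance of the delay alone. The sketch below is a minimal illustration: the function names and the sample figures (mean delay 20 ms, variance 25 ms^2) are hypothetical and are not taken from the study.

```python
import math

def peak_delay(mean_delay, var_delay, k=3.0):
    """Peak-delay estimate: mean + k * sigma (k = 3 in the text)."""
    return mean_delay + k * math.sqrt(var_delay)

def chebyshev_bound(k=3.0):
    """Chebyshev upper bound on P{|D - mean| > k * sigma}."""
    return 1.0 / k ** 2

# Hypothetical class-k figures, in seconds and seconds squared.
d_peak = peak_delay(mean_delay=0.020, var_delay=25e-6)
print(f"D_p = {d_peak * 1e3:.1f} ms, "
      f"exceeded with probability <= {chebyshev_bound():.3f}")
```

Because Chebyshev's inequality holds for any distribution with finite variance, the estimate is conservative; for a specific delay distribution the actual exceedance probability is usually much smaller than 1/9.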

Other user related performance measures can be defined in
relation to specific modes of operation. In certain cases, some
tasks are rejected for processing. We then set:

    $P_R(k)$ = probability that a class k task is rejected.     (I.3.2-15)

Certain devices, or terminals (users), experience local
queueing phenomena. Considering device i, one then defines:

    $U_B(i)$ = index of utilization of the device i buffer;     (I.3.2-16)

    $M_B(i)$ = average occupancy of the device i buffer;        (I.3.2-17)

    $POF(i)$ = probability of overflow of the device i buffer.  (I.3.2-18)

User related reliability measures are of prime importance as
well. These will be detailed in the section on network reliability.
In particular, it is of interest to specify and compute the following
measures:

    $L(k)$ = probability of loss of a class k message;          (I.3.2-19)

    $L_D(k)$ = probability that a class k message (job, task)
               does not receive service within D sec.           (I.3.2-20)

1.3.3 System and Network Related Performance Indices 

The reliability issue of the topological structure of the 
network gives rise to a number of invulnerability measures. 

In particular, one defines:

    $K(i)$ = minimal number of line failures that disconnect
             device i (or application process i) from the
             computer complex;                                  (I.3.3-1)

    $P_D(i)$ = probability that device i, or application
             process i, will be disconnected from the processing
             resources (due to line failures, terminal failures
             or GPC failures).                                  (I.3.3-2)

An overall network throughput measure is

    $TTH$ = average number of tasks processed by the
            system per unit time.                               (I.3.3-3)

We can then write 

TTH = (1. 3.3-4) 

k 

The network delay measures are specified by the values $\{\bar{D}(k)\}$,
$\{\bar{D}(k) + 3\sigma(k)\}$. Indices of utilization of the GPC memory and buffers
and the device buffers have been defined above.

Performance indices indicating the sensitivity of the network
operation to fluctuations in traffic are important. For that purpose,
we set

    $\Delta\bar{D}(k)$ = change in the average class k message delay as a
             result of the increase of the overall traffic rate
             according to $\{\Delta\lambda(k)$ [mess./sec]$\}$;

    $\Delta TTH$ = change in the network throughput with the increase
             of intensity of task demands.

Also of importance are measures indicating the growth
capability of the network. In particular, we set:

    $\Delta M_B(D_k)$, $\Delta M_C(D_k)$ = average growth allowed in occupancy of
             computer buffer (B), or memory (C), attaining
             average task delays not higher than $D_k$;

    $\Delta U_G(D_k)$, $\Delta U(i, D_k)$ = average growth allowed in index of
             utilization of GPC elements (CPU, IOP, buffers), or
             device i elements, causing message delays
             not higher than $D_k$.

The following sections will present proper queueing models 
to be employed in analyzing the Space Shuttle DPS. We will also 
present performance analysis results for such models. The 
network and computer complex designer will then be able to apply 
the proper model to the underlying subsystem he is analyzing. He 
will subsequently be able to compute the set of relevant performance 
measures indicated above. In particular, note the following main 
families of performance indices that we defined above, and will 
compute in the following sections. 

• Task (job, message) response times (queueing and service
  time delays).

• System throughput.

• System indices of utilization.

• Reliability measures.

• Performance sensitivity measures.

• Network growth measures.
1.4 TIME-SHARING QUEUEING MODELS

1.4.1 Time-Shared Single Processor Systems 

We consider a queueing model for a system wherein the serving
resource is modelled as a single server queue. This resource
can model the GPC of the Shuttle orbiter avionics system. The
service provided by the latter includes the relevant GPC CPU
and IOP processing functions. Demands are made upon this single
server processor by the arriving messages or requests. Due to
the finite resources available to the server, and its finite
processing rates, arriving messages will have to be queued at a
buffer before they can be processed. A scheduling algorithm then
needs to be devised to control the assignment of service resources
to the arriving messages and demands.

We consider in this section scheduling algorithms that
use the service facility on a time-shared basis.

The general structure of the queueing model is shown in
Fig. 1.4-1.

Figure 1.4-1.

In the typical time-shared system, one generally wishes
to attain a message average queueing delay (response time) which
is proportional to the average message length. Thus, short
messages expect to experience short waiting times, while long
messages are prescribed longer time delays. This is achieved by
the feedback queueing model shown in Fig. 1.4-1.

In this time-sharing system, the server (GPC) allows a
message to stay in service (be processed) only for a certain
time period, called a quantum. The quantum duration may vary, and
it can depend upon the state of the system, the message priority
and the message's past processing record. If the processing time
required by the message or request is not satisfied by the end of
the quantum service period, the message is returned to the queueing
(storage) system, where it joins the queue of messages waiting
for service. Otherwise, the processing required by the message
has terminated and it leaves the GPC for its destination.

1.4.2 Traffic and Performance Parameters 

We need to statistically describe the stream of message 
arrivals at the server, and the service (processing) demands made 
by each message. 

For that purpose, we generally assume message interarrival
times to be independent identically distributed random variables.
The message interarrival time distribution is set equal to

    $A(t) = P\{\text{interarrival time} \le t\}$.              (I.4.2-1)

Message service times, or required overall GPC (processor) times,
are generally assumed to be independent identically distributed
random variables, for any specific class (or priority group) of
messages. We then set the message service time distribution to be

    $B(t) = P\{\text{message service time} \le t\}$.           (I.4.2-2)

It is common to assume that messages arrive according to a
Poisson process with intensity $\lambda$ [mess./sec]. This amounts to
assuming an exponential distribution for the interarrival time:

    $A(t) = 1 - e^{-\lambda t}$,  $t \ge 0$.                   (I.4.2-3)

A Poisson arrival stream models a completely random stream of
arrivals (in that the interarrival durations follow a memoryless
distribution).

It is also often convenient, to simplify analytical
studies, to assume the required message service time to be
exponentially distributed. Then

    $B(t) = 1 - e^{-\mu t}$,  $t \ge 0$,                       (I.4.2-4)

and we set the

    Average Message Service Time = $1/\mu$ [sec/mess.].        (I.4.2-5)

In a time-sharing system, the quantum service provided by the
processor is usually set equal to a constant $\Delta$, or is defined
as $\Delta_{pn}$ to depend upon the message priority class p and upon
its number (n) of prior entries into service. Also included in this
quantum duration is the swap time period, spent in transferring
messages between the queueing and service facilities.

The main performance measure used in this section is the
message average time delay (response time) D. It represents the
average overall time spent in the queueing system by a message.
The average time spent by a message waiting at the queueing
facility is denoted by W. The average message service time is
$\bar{S}$. We clearly have

    $D = W + \bar{S}$.                                         (I.4.2-6)

As a major objective of this feedback system is to attain a
message time delay proportional to the message length T, $D \propto T$, we
can represent explicitly the delay and waiting time measures for
a message as functions of its required service time T, and denote
them by D(T) and W(T), respectively. We have

    $D(T) = W(T) + T$.                                         (I.4.2-7)

In the following, we describe certain useful scheduling
algorithms for time-sharing systems, and indicate their performance
characteristics.

We assume messages to arrive according to a Poisson process
with intensity $\lambda$ [mess./sec].

1.4.3 Batch Processing: First-Come First-Served 

The structure of the basic queueing system, where no feedback 
is employed is shown in Fig. 1.4-2. 



Fig. 1.4-2. 

Messages arriving at the system are stored in a queue. They
are served by the single server (processor) on a first-come
first-served basis. Once a message is accepted into service,
it is allowed to complete its processing. (The quantum is thus
of infinite duration.) The average message response time D(T)
is given by the well-known Pollaczek-Khinchine formula:

    $D(T) = \frac{\lambda \overline{S^2}}{2(1-\rho)} + T$,  for $\rho < 1$,   (I.4.3-1)

where

    $\rho = \lambda \bar{S}$ = traffic intensity parameter;    (I.4.3-2)

    $\bar{S} = \int_0^\infty t \, dB(t)$ = average message processing time;   (I.4.3-3)

    $\overline{S^2} = \int_0^\infty t^2 \, dB(t)$ = second moment of message
                      processing time.                         (I.4.3-4)

Note that

    $\overline{S^2} = \bar{S}^2 + \sigma^2$,                   (I.4.3-5)

where $\sigma$ is the standard deviation of the message service time.
The traffic intensity parameter $\rho$ yields the ratio

    $\rho$ = (average message processing time) /
             (average message interarrival time).

It is a measure of congestion in the system. We obtain

    $W = D = \infty$,  if $\rho \ge 1$,

so that arbitrarily high time delays are experienced by messages, as
the system evolves in time, if $\rho \ge 1$. Hence, we are interested in
operating the system such that $0 \le \rho < 1$, and finite queueing
delays result.

We note that the average message response time depends only 
on the first two moments of the required message processing time, 
and not on its distribution. 

The message average waiting time W(T) = D(T) - T is given by

    $W(T) = \frac{\lambda \overline{S^2}}{2(1-\rho)}$,  for $\rho < 1$.   (I.4.3-8)

For the special case where message processing times follow the
exponential distribution (I.4.2-4), we have

    $\bar{S} = 1/\mu$,  $\overline{S^2} = 2(1/\mu)^2$,         (I.4.3-9)

so that the message waiting time is given by

    $W = \frac{\rho/\mu}{1-\rho}$,  where $\rho = \lambda/\mu < 1$.   (I.4.3-10)

Thus, for this (FCFS) model, the waiting time W is independent
of the message required processing time T.

As noted in Fig. 1.4-3, the message waiting time function W
becomes a very sensitive function of $\rho$ as $\rho$ approaches 1. Thus,
the message queueing delay increases very fast as the system's
congestion approaches its saturation value. One should therefore
design the system so that it avoids the traffic intensity region
close to saturation, i.e., close to $\rho = 1$.
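
The sensitivity of the FCFS waiting time to the traffic intensity can be tabulated from the Pollaczek-Khinchine expression, W = lambda * E[S^2] / (2 * (1 - rho)). The following is a minimal sketch under assumed values: the function name and the service rate of 10 mess./sec are illustrative, and the constant-service case is added only for comparison.

```python
def pk_waiting_time(lam, mean_s, second_moment_s):
    """Mean FCFS waiting time for Poisson arrivals and general service times."""
    rho = lam * mean_s
    if rho >= 1.0:
        return float("inf")  # saturated: delays grow without bound
    return lam * second_moment_s / (2.0 * (1.0 - rho))

mu = 10.0  # hypothetical service rate [mess./sec]
for rho in (0.5, 0.8, 0.9, 0.99):
    lam = rho * mu
    w_exp = pk_waiting_time(lam, 1 / mu, 2 / mu ** 2)    # exponential service
    w_const = pk_waiting_time(lam, 1 / mu, 1 / mu ** 2)  # constant service
    print(f"rho={rho:4.2f}  W_exp={w_exp:7.3f} s  W_const={w_const:7.3f} s")
```

With the same mean service time, constant service times halve the waiting time relative to exponential ones, which illustrates the dependence on the second moment.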



The average queue-size parameter X, describing the average
number of messages in the system, is given for the latter queueing
system by

    $X = \frac{\rho}{1-\rho}$,  for $\rho < 1$.                (I.4.3-11)

Note that for $\rho = 0.8$, only an average of 4 messages are in the
processing system (queued or being processed), while for $\rho = 0.9$
and 0.99, the average queue size is equal to 9 and 99, respectively.
We also have that

    P{system is empty} = $1 - \rho$,  for $\rho < 1$,          (I.4.3-12)

so that

    U = P{system is busy} = $\rho$,  for $\rho < 1$.           (I.4.3-13)

Thus, $\rho$ serves as a measure of system utilization. For $\rho < 1$, the
processing system is kept busy a fraction $\rho$ of time. For $\rho = 0.8$
and 0.9 the processor is busy 80% and 90% of the time, respectively.
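
The queue sizes quoted above follow from X = rho / (1 - rho). The sketch below recomputes them and cross-checks the result against Little's law, X = lambda * D, which is not invoked in the text but holds for this system. The function names and the service rate are illustrative assumptions.

```python
def mean_number_in_system(rho):
    """Average number of messages in the system, X = rho / (1 - rho)."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("a finite average requires 0 <= rho < 1")
    return rho / (1.0 - rho)

def mean_response_time(lam, mu):
    """D = W + 1/mu = 1 / (mu - lam) for exponential service under FCFS."""
    return 1.0 / (mu - lam)

mu = 10.0  # hypothetical service rate [mess./sec]
for rho in (0.8, 0.9, 0.99):
    lam = rho * mu
    x = mean_number_in_system(rho)
    little = lam * mean_response_time(lam, mu)  # Little's law: X = lambda * D
    print(f"rho = {rho}: X = {x:.1f}, lambda * D = {little:.1f}")
```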

Clearly, one must compromise between having high enough a 
processor utilization factor and low enough message response times. 

The system utilization index $U = \rho$ is also shown in Fig. 1.4-3.

1.4.4 Round-Robin Processing 

In Round-Robin (RR) processing, the processing (GPC) facility
serves each message for a fixed quantum period $\Delta$. Newly arrived
messages join the end of the queue. When they arrive at the front
of the queue they are sent into the processing facility where they
are served for a period of $\Delta$ sec. Then, if their service demand
is fully satisfied, they leave the system. Otherwise they are cycled
back to the end of the queue, starting again the same queueing-service
process. A RR system structure is illustrated by Fig. 1.4-4.

Figure 1.4-4. Round-robin system: new arrivals and cycled messages
join the queue feeding the processor.

The RR service discipline can also be regarded as a processor
sharing service procedure. To explain this notion, we note that
when there are n messages in the system, and if $\Delta$ is small, each
message is in fact processed (served) by the processing facility
at a rate of 1/n sec/sec. Thus, we can regard the processor as
shared among the various messages on an equal basis.

For a round-robin system with arbitrarily small $\Delta$ value, the
average message delay D(T) is given by

    $D(T) = \frac{T}{1-\rho}$,  for $\rho = \lambda \bar{S} < 1$,   (I.4.4-1)

where T is the required message processing time. The average
message waiting time is then equal to

    $W(T) = \frac{\rho T}{1-\rho}$,  for $\rho = \lambda \bar{S} < 1$.   (I.4.4-2)

Thus, the RR system yields a message response time which
is linearly dependent on the required message processing time T.

For exponentially distributed message service times, we can
note that messages requiring shorter (longer) processing times than
the average one will experience shorter (longer) response times in
a round-robin system than in a first-come first-served system.
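
The comparison between the two disciplines can be made concrete for exponential service times, using D(T) = T / (1 - rho) for round robin and D(T) = (rho / mu) / (1 - rho) + T for FCFS. The sketch below is illustrative: the function names and the rates 8 and 10 mess./sec are assumptions.

```python
def delay_fcfs(t, lam, mu):
    """FCFS response time for exponential service: waiting time plus T."""
    rho = lam / mu
    return (rho / mu) / (1.0 - rho) + t

def delay_rr(t, lam, mu):
    """Round-robin (processor sharing) response time: T / (1 - rho)."""
    rho = lam / mu
    return t / (1.0 - rho)

lam, mu = 8.0, 10.0  # hypothetical arrival and service rates [mess./sec]
for t in (0.02, 0.1, 0.5):  # below, equal to, and above the mean 1/mu
    print(f"T={t:4.2f} s  FCFS={delay_fcfs(t, lam, mu):.3f} s  "
          f"RR={delay_rr(t, lam, mu):.3f} s")
```

The difference between the two delays is rho * (T - 1/mu) / (1 - rho), so the curves cross exactly at the average processing time T = 1/mu.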

1.4.5 Round-Robin with Priorities 

We divide the arriving messages into P priority classes. A
p-priority message is a message which belongs to priority group p,
where p is an integer in {1, 2, ..., P}. A p-priority message is
considered to have higher priority than a q-priority message if
p < q.

We assume P streams of message arrivals at the single server
queueing system. The stream of p-priority messages is taken to
be a Poisson process with intensity $\lambda_p$ [mess./sec].

Assume p-priority messages to have exponentially distributed
processing (service) times with mean $1/\mu_p$ [sec/mess.].

A p-priority message is assigned an $r_p$ fraction of the
processing time. We can choose $r_p$ as desired, setting higher $r_p$
values for higher priority (lower p) messages.

For example, let $f_p$ be an arbitrarily chosen function that
sets higher values to higher priority (lower p) messages. When
there are $x_i$ messages at the system from the i-th group, i = 1, 2, ..., P,
we set the fraction $r_p$ of processing time dedicated to the p-priority
customer to be

    $r_p = \frac{f_p}{\sum_{i=1}^{P} f_i x_i}$.                (I.4.5-1)

Thus, we have specified a processor sharing system where the share
of the processor assigned to each message depends upon its priority
group.
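
The sharing rule can be illustrated by computing the per-message shares for a given system state, taking the share of a class-p message to be its weight f_p divided by the sum of f_i * x_i over all classes. The weights and message counts below are hypothetical, as is the function name.

```python
def processor_shares(f, x):
    """Share of the processor given to each single message of every class."""
    total = sum(fi * xi for fi, xi in zip(f, x))
    if total == 0:
        raise ValueError("no messages in the system")
    return [fp / total for fp in f]

# Hypothetical state: three priority classes, class 1 weighted most heavily.
f = [4.0, 2.0, 1.0]  # discrimination weights f_1, f_2, f_3
x = [1, 2, 4]        # messages present from each class
shares = processor_shares(f, x)
print(shares)

# The shares of all messages present add up to the whole processor.
print(sum(r * n for r, n in zip(shares, x)))
```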


The average delay $D_p(T)$ for a p-priority message is then

    $D_p(T) = \frac{T}{1-\rho} \left[ 1 + \sum_{i=1}^{P} \left( \frac{f_i}{f_p} - 1 \right) \rho_i \right]$,   (I.4.5-2)

where

    $\rho = \sum_{i=1}^{P} \rho_i$,  $\rho_i = \lambda_i / \mu_i$.   (I.4.5-3)

Thus, the message response time again depends linearly upon the
message service time T, as for the round-robin system. But, in
addition, we obtain the message response time to depend upon
the message priority class.

By properly choosing the discrimination function $f_p$, we can
separate as we wish between the response-time vs. $\rho$ curves of the
various priority classes. Typical curves for the message waiting-time
functions W are shown in Fig. 1.4-5.

Figure 1.4-5.
1.4.6 A Round-Robin Scheme with Time-Varying Priorities

We can assign a time-varying priority index to each message,
depending on whether it is being processed or stored in the
queue. Thus, the priority of a message is set to increase linearly
at a rate $\alpha$ whenever it is waiting in the queueing facility.
Its priority is, on the other hand, set to increase at a lower
rate $\beta$, where

    $\alpha \ge \beta \ge 0$,

when it is in the service (processing) facility.

Service is provided to all messages in the system which
presently have the highest priority. When more than one message
has the present highest priority value, all the latter are
served in a round-robin fashion, thus sharing the processor
resources.

An entering message will then increase its priority at
a higher rate than those currently served. Eventually, this
message catches up with those being processed. Then it is
entered into the service facility and remains there until its
service demand is satisfied.

The average message delay D(T) in this system is given by

    $D(T) = \frac{1/\mu}{1-\rho} + \frac{T - 1/\mu}{1 - \rho (1 - \beta/\alpha)}$,   (I.4.6-1)

where

    $\rho = \lambda/\mu < 1$,                                  (I.4.6-2)

$\lambda$ is the intensity of the Poisson message arrivals, and message
service times are exponentially distributed with an average
message processing time of

    $\bar{S} = 1/\mu$ [sec/mess.].

The average message waiting time is

    $W(T) = D(T) - T$.                                         (I.4.6-3)


The dependence of the message waiting time W(T) on the required
processing time T is shown in Fig. 1.4-6, where the ratio $\beta/\alpha$
is a parameter.

Figure 1.4-6.

We note that a message which requires a processing time T
equal to the average one, $T = 1/\mu$, will experience the same
response time under any $\beta/\alpha$ value.

When $\beta/\alpha = 1$, the oldest message in the system captures
the processor and uses it for itself alone. Hence, we obtain
a FCFS queueing scheme.

When $\beta = 0$, we obtain a round-robin scheme, since a message
does not gain priority while being processed.

Changing $\beta/\alpha$ between 0 and 1, we obtain response time curves
that vary continuously between those of a FCFS system and a RR
system.
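
The limiting cases described above can be checked numerically. The sketch below assumes that the delay of this scheme has the form D(T) = (1/mu) / (1 - rho) + (T - 1/mu) / (1 - rho * (1 - beta/alpha)), which is the expression usually quoted for this kind of linearly increasing priority discipline and which reproduces the FCFS and round-robin limits. The function name and the rates are illustrative assumptions.

```python
def delay_time_varying_priority(t, lam, mu, ratio):
    """Assumed delay formula; ratio = beta/alpha must lie in [0, 1]."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("beta/alpha must lie in [0, 1]")
    rho = lam / mu
    rho_eff = rho * (1.0 - ratio)  # effective load seen while in service
    return (1.0 / mu) / (1.0 - rho) + (t - 1.0 / mu) / (1.0 - rho_eff)

lam, mu = 8.0, 10.0  # hypothetical arrival and service rates [mess./sec]
for ratio in (0.0, 0.5, 1.0):
    row = [delay_time_varying_priority(t, lam, mu, ratio) for t in (0.02, 0.1, 0.3)]
    print(f"beta/alpha={ratio:3.1f}  " + "  ".join(f"{d:.3f}" for d in row))
```

Under this assumed form, the column for T = 1/mu is the same for every ratio, the row for ratio 0 matches T / (1 - rho), and the row for ratio 1 matches the FCFS delay.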
1.4.7 Foreground-Background Processing Schemes

In a foreground-background service scheme, we have two
queues. A newly arrived message joins the first queue. It
receives there its first quantum of service, being served on
a FCFS basis. Thereafter, the message joins the end of a second
queue. From then on the message can join only the second queue.

The processing facility always serves first the messages in the 
first queue, called also the foreground messages (or jobs, tasks). 

When the first queue is empty, the processor turns to serve the 
background messages queued in the second queue. 

The generalized foreground-background (FB) scheduling 
scheme described next is structured so that service is always 
given to that message which has so far received the least service 
of all . 

A new message which finds the system empty is given the full
attention of the processor. It is served at a rate of 1 sec/sec.
If, prior to the termination of its service, a new message arrives, the
processor gives its full service to the second message. This
continues until the second message has received the same service
time as the first one. Subsequently, if there are no new arrivals
and these messages have not yet departed, the processor is shared
among these messages, yielding each a service rate of 1/2 sec/sec;
and so on. Thus, the processor always serves those messages
that have so far received the least service.

The average message response time D(T) is given by

    $D(T) = \frac{A(T) + T}{1 - \rho_T}$,                      (I.4.7-1)

where

    $A(T) = \frac{\lambda \overline{S(T)^2}}{2(1 - \rho_T)}$,  (I.4.7-2)

    $\overline{S(T)^2} = \int_0^T x^2 \, dB(x) + T^2 [1 - B(T)]$,   (I.4.7-3)

    $\overline{S(T)} = \int_0^T x \, dB(x) + T [1 - B(T)]$,    (I.4.7-4)

    $\rho_T = \lambda \overline{S(T)} < 1$,                    (I.4.7-5)

where B(x) is the distribution function of the message processing
time. We note that

    $\overline{S(\infty)} = \bar{S}$,  $\overline{S(\infty)^2} = \overline{S^2}$,  $\rho_\infty = \rho$,

    $A(\infty) = W(FCFS)$,

where W(FCFS) is the message waiting time in a FCFS queueing system.
A typical curve of D(T), for exponential service times, is
shown in Fig. 1.4-7.

Figure 1.4-7.



One computes the slope of D(T) to be equal to 1 at T = 0 and
to $1/(1-\rho)$ at $T = \infty$. Thus, a message with a very short required
processing time is given here a service rate close to unity. On
the other hand, a message requiring a very long processing time
has to wait until all the messages arriving during its required
service time are first processed, thus experiencing a service
rate equal to that given to it in a RR scheme.
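
For exponential service times the truncated moments in (I.4.7-3) and (I.4.7-4) have closed forms, so D(T) can be evaluated directly. The sketch below is a minimal illustration: the closed forms are worked out here from B(x) = 1 - exp(-mu * x), and the function name and rates are assumptions.

```python
import math

def fb_delay_exponential(t, lam, mu):
    """Mean FB response time D(T) for Poisson arrivals, exponential service."""
    tail = math.exp(-mu * t)                  # 1 - B(T)
    s1 = (1.0 - tail) / mu                    # truncated first moment
    s2 = 2.0 * (1.0 - tail) / mu ** 2 - 2.0 * t * tail / mu  # truncated second moment
    rho_t = lam * s1                          # truncated traffic intensity
    a_t = lam * s2 / (2.0 * (1.0 - rho_t))    # A(T) of (I.4.7-2)
    return (a_t + t) / (1.0 - rho_t)

lam, mu = 8.0, 10.0  # hypothetical arrival and service rates [mess./sec]
for t in (0.001, 0.01, 0.1, 1.0):
    d = fb_delay_exponential(t, lam, mu)
    print(f"T={t:5.3f} s  D(T)={d:8.4f} s  D(T)/T={d / t:6.3f}")
```

The ratio D(T)/T starts near 1 for very short messages and, for these rates, approaches 1/(1 - rho) = 5 for very long ones, passing through larger values for intermediate T.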

1.4.8 Multilevel Processor Sharing Schemes

A family of multilevel processor sharing service disciplines
can be defined by dividing the message (or job, task, process)
processing time into intervals by the values $\{a_i\}$:

    $0 = a_0 < a_1 < \cdots < a_N < a_{N+1} = \infty$.

We now define N+1 scheduling procedures. The i-th procedure $(SP)_i$
is applied when the message has so far received an amount of service
x in the interval

    $a_{i-1} \le x < a_i$,  i = 1, 2, ..., N+1.

We can set $(SP)_i$ to be either FCFS, FB or RR. Also,
between these intervals messages are treated as foreground-background
jobs, so that the processor gives its complete
attention to messages in the lowest level nonempty queue.

As a FB discipline is used between levels, one can observe
that the message response time depends only on the discipline
used when it departs from the system, after receiving its complete
processing requirement.

Subsequently, when a message departs at the i-th level,
receiving there FCFS service, its delay time is given by

    $$ D(T) = \frac{A(a_i) + T}{1 - \rho_{a_{i-1}}} , \quad a_{i-1} \le T < a_i , $$

where A(x) is given by (I.4.7-2) and $\rho_x$ by (I.4.7-5).


When the last level i of the message uses an FB discipline, D(T)
is given by the FB formulas, assuming all the levels below
to use the FB procedure.

For exponentially distributed message processing times, one
can note the response curves D(T) for FCFS multilevels or RR
multilevels to be close to the response curve obtained when a
single FB service discipline is used.

One can properly choose the various levels and associated
disciplines so that a D(T) curve with certain desired characteristics
is obtained for the underlying processing system.


I.4.9 Comparing the Performances of the Time-Sharing Schemes

The message delays experienced under the various time-sharing
schemes presented above can be compared as follows.

For messages that require very short processing times the
foreground-background (FB) service discipline yields the shortest
response times. Comparable performance is then exhibited also by
a round-robin (RR) scheme.

For messages that require long processing times, the first-come
first-served (FCFS) service discipline yields the lowest
response time values. This is also the case when medium-valued
required service times are involved.

The round-robin scheme with time varying priorities, also
called the Selfish Round Robin (SRR) scheme, as well as the
Multi-level (ML) scheme, yield D(T) curves that lie between those
of the FCFS and FB schemes. Figure I.4-8 illustrates the typical
situation.



In designing the time-shared processing part of our computer
system, we can thus attain the proper response time D(T) vs required
processing time (T) curve, by choosing the proper multi-level (ML)
scheme, or just a FCFS, RR, SRR or FB scheme. The optimal choice
can be made based on the presented results, for each
traffic-message environment under consideration.

I.5 PRIORITY QUEUEING MODELS

I.5.1 On Service Disciplines

Messages or requests for service arriving at the central
processing (GPC) system are queued (stored) in a buffer until
the processor is ready to serve them. These messages need to be
properly ordered for service. This ordering follows the service
discipline, or scheduling algorithm, governing the operation of
the underlying queueing system.

A multitude of service disciplines can be defined and implemented
in our data processing system. Different disciplines will be
required at different times, as different jobs and tasks require
service. It is thus of importance to implement a dynamic (flexible)
scheduling rule.

In this section we classify and discuss some of the priority
service models of importance and relevance to the Space Shuttle
orbiter avionics system under consideration.

In designing a scheduling algorithm, one can assume an a priori
distribution of priorities among the various messages (or tasks),
according to their desired response time and measure of importance
or urgency. In turn, one wishes to dynamically modify the order of service
of messages in the system in accordance with the state of the
system, so that a proper performance measure is optimized. Such
a performance measure involves the satisfaction of the required
response times by the various messages, in accordance with their
class, urgency and statistical characteristics.

I.5.2 Scheduling Algorithms for Time-Shared Processing Systems

We have described in Section I.4 a multitude of scheduling
algorithms for time-shared processing systems. We have also given
there the associated response time functions and compared the
performance characteristics of the various schemes.

In these time-sharing systems, the processing center has
been time-shared among the messages. A single processing unit has
been assumed. The following disciplines have been noted.

First-come first-served (FCFS) service discipline. Messages
are served in order of arrival. Thus, messages are queued at
the storage facility in the order of their arrival. When the
processor becomes free, the message at the head of the queue is
accepted for processing.

Round-robin (RR) service discipline. The processing system
is time-shared among the messages in the system on an equal
basis. Thus, if n messages (tasks) are in the system (requiring
processing), each message is processed at a rate of 1/n sec/sec.

Round-Robin with Priorities. The processor service time is
shared among the messages (jobs) in the system in accordance with
the message priority class. Thus, messages which belong to a
higher priority class are assigned a higher service rate.

Round-Robin with Time-Varying Priorities. Messages that are
currently being processed are assigned a lower shared processing
rate than messages that have just arrived.

Foreground-Background (FB) Processing Schemes. Messages are
stored in two different queues. Newly arrived messages are assigned
to the first queue, which is always given the higher priority for
service. They are then given a fixed amount of service time and
subsequently entered into the second queue. The latter is
served only when the first queue is empty. Differentiation is thus
made between foreground and background service processes.

Multilevel Processor Sharing. The service discipline assigns
a proper scheduling rule to each message in the system
according to the amount of service already given to the message.
In this way a set of service discipline levels is set up, and
the different levels are controlled between them by an FB algorithm.

We have noted the response time experienced by a message using
the RR, FB and other related time-sharing schemes mentioned above to
be proportional to the required message processing time. To obtain
this property, the various schemes utilize a feedback service
procedure, in which a quantum of service is given to each
message at a time. Such a procedure needs to be adopted when we
have no prior knowledge concerning the processing time required
by the message (or job, task). The feedback scheme estimates this
time through quantization.

However, if prior information is available concerning the 
required message service time, a much simpler nonfeedback structured 
scheme can be devised, incorporating this information, to yield the 
same message delay characteristics. This is many times the case 
in our system. Such priority service disciplines will be presented 
in this section. 

We also note that the FCFS service discipline yields a message 
waiting-time which is independent of the message required processing 
time. 

We can also consider a data processing system with a set of
processors available to serve the messages. The queueing scheme
then involves multiple service channels. The resulting queueing
characteristics are similar to those mentioned in Section I.4,
except that a number of messages can be processed simultaneously,
so that the multi-processor yields a higher service rate.

Another service discipline that could have been mentioned is 
the last-come first-served scheduling procedure. This discipline is 
noted to yield a response time vs required service time D(T) function 
which is identical to that obtained by a round-robin scheme. The 
schemes however yield different message delay variances. 

We thus note, as will be observed again later, that to
compare various priority service disciplines one needs to compute
and compare also the variances associated with the message delays.

I.5.3 Service Disciplines for Messages in Different Priority Classes

Messages are many times classified into different priority
classes. This classification is affected by the message index of
urgency and importance and by the message required response time.

Priority-1 (or class-1) messages have higher priority than
priority-2 (or class-2) messages. In general, if there are P
priority classes, we assign a higher priority to the service of class-k
messages over class-j messages, whenever k < j. Messages
belonging to the same priority class can be served according to
any pre-assigned priority procedure. In particular, we assume,
unless stated otherwise, that a FCFS service discipline controls
the service of messages belonging to the same priority class.

In considering the service of messages belonging to different
priority classes by a single processing center, we can distinguish
between the following disciplines.

Nonpreemptive Priority Discipline. Messages are ordered for
service in accordance with their priority class. When the processor
becomes available for service it accepts the message of highest
priority. A FCFS ordering is used among messages of the same
priority class. If, however, a higher priority message arrives
at the system while a lower priority one is being processed, the
latter is not preempted and its processing is allowed to be
carried to completion.

Preemptive Resume Priority Discipline. The scheduling
algorithm is as above except that if a newly arrived message
belongs to a higher priority class than that presently in service,
it is allowed to preempt the currently served message. The preempted
message joins the queue, and when accepted for service its
service resumes from the point at which it was interrupted.

Preemptive Repeat Service Discipline . This scheduling 
procedure operates as the preemptive resume one except that 
the service of a message that has been previously preempted starts 
from the beginning. Thus, all the processing provided to a message 
is assumed lost if this message gets preempted by a higher priority 
message. 

I.5.4 Analysis of a Priority Queueing System

Consider a system where messages are classified into 2 priority
classes. Class 1 messages require high priority, while class 2
messages are of low priority.

Messages of class 1 arrive at random at the system according
to the statistics of a Poisson process, with an arrival intensity of
$\lambda_1$ [mess./sec]. Class-2 messages arrive independently, also according
to a Poisson stream, with intensity $\lambda_2$ [mess./sec].

The system is assumed to provide a single server; i.e., a
single processing facility. Messages of class 1 and 2 may require
different (random) processing times. Thus, we set:

    $S_1$ = average processing time required by a priority-1
            message ;    (I.5.4-1)

    $S_2$ = average processing time required by a priority-2
            message .    (I.5.4-2)

The corresponding second moments of the required message processing
times are denoted as

    $$ \overline{S_1^2} = E(S_1^2) , \quad \overline{S_2^2} = E(S_2^2) . $$    (I.5.4-3)


The average waiting time experienced by a randomly chosen
message is denoted by W. Its average time delay (response-time)
in the system is denoted by D. The average waiting time and time
delay of a priority-1 message are $W_1$ and $D_1$, respectively. Similarly,
the average waiting time and time delay of a priority-2 message are
$W_2$ and $D_2$, respectively.

We can write:

    $$ D = W + S ; $$    (I.5.4-4)
    $$ D_1 = W_1 + S_1 ; $$    (I.5.4-5)
    $$ D_2 = W_2 + S_2 , $$    (I.5.4-6)

where S is the average service time of a message chosen at random.
Also, since a message chosen at random will belong to class 1 with
probability $\lambda_1/\lambda$, where $\lambda = \lambda_1 + \lambda_2$, and to class 2 with probability
$\lambda_2/\lambda = 1 - \lambda_1/\lambda$, the following relationships hold:

    $$ W = \frac{\lambda_1}{\lambda} W_1 + \frac{\lambda_2}{\lambda} W_2 ; $$    (I.5.4-7)

    $$ S = \frac{\lambda_1}{\lambda} S_1 + \frac{\lambda_2}{\lambda} S_2 ; $$    (I.5.4-8)

    $$ D = \frac{\lambda_1}{\lambda} D_1 + \frac{\lambda_2}{\lambda} D_2 . $$    (I.5.4-9)


The traffic intensities of the higher priority and lower
priority streams, $\rho_1$ and $\rho_2$, are given by

    $$ \rho_1 = \lambda_1 S_1 ; $$    (I.5.4-10)

    $$ \rho_2 = \lambda_2 S_2 . $$    (I.5.4-11)

The traffic intensity $\rho$ associated with the combined stream of arrivals
is equal to

    $$ \rho = \rho_1 + \rho_2 = \lambda_1 S_1 + \lambda_2 S_2 . $$    (I.5.4-12)


We assume the processor to employ a non-preemptive priority 
service discipline. 

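If $\rho_1 \ge 1$, then high and low priority messages will experience
arbitrarily long time delays as the system evolves in time. Thus,

    $$ W_1 = W_2 = \infty \quad \text{if } \rho_1 \ge 1 . $$

If however $\rho_1 < 1$, the higher priority message experiences a
finite waiting time given by

    $$ W_1 = \frac{\lambda_1 \overline{S_1^2} + \lambda_2 \overline{S_2^2}}{2 (1 - \rho_1)} \quad \text{for } \rho_1 = \lambda_1 S_1 < 1 . $$    (I.5.4-13)

Also, we have

    $$ P(X_1 = 0) = P(W_1 = 0) = 1 - \rho_1 , \quad \text{for } \rho_1 < 1 , $$    (I.5.4-14)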





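where $X_1$ denotes the (queue-size) number of class 1 messages in
the system. Thus, with probability $1 - \rho_1$ there will be no higher
priority messages in the system. The average delay of a priority
message is given by (Pollaczek-Khintchine equation)

    $$ D_1 = W_1 + S_1 = \frac{\lambda_1 \overline{S_1^2} + \lambda_2 \overline{S_2^2}}{2 (1 - \rho_1)} + S_1 \quad \text{for } \rho_1 < 1 . $$    (I.5.4-15)

The average number of priority messages in the system, denoted
as $\bar{X}_1$, is equal (by Little's Theorem) to

    $$ \bar{X}_1 = \lambda_1 D_1 = \lambda_1 W_1 + \rho_1 = \frac{\lambda_1 (\lambda_1 \overline{S_1^2} + \lambda_2 \overline{S_2^2})}{2 (1 - \rho_1)} + \rho_1 , $$    (I.5.4-16)

where $\rho_1 = \lambda_1 S_1 < 1$.

If

    $$ \rho = \rho_1 + \rho_2 \ge 1 , $$

class-2 messages will experience arbitrarily long queueing delays,
so that

    $$ D_2 = W_2 = \infty \quad \text{for } \rho \ge 1 . $$    (I.5.4-17)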


For $\rho < 1$, the average waiting time for a lower priority message is
given by

    $$ W_2 = \frac{\lambda_1 \overline{S_1^2} + \lambda_2 \overline{S_2^2}}{2 (1 - \rho_1)(1 - \rho)} \quad \text{for } \rho < 1 . $$    (I.5.4-18)

The response time of a lower priority message is thus equal to

    $$ D_2 = \frac{\lambda_1 \overline{S_1^2} + \lambda_2 \overline{S_2^2}}{2 (1 - \rho_1)(1 - \rho)} + S_2 \quad \text{for } \rho = \lambda_1 S_1 + \lambda_2 S_2 < 1 . $$    (I.5.4-19)

The average overall queue-size, i.e., the average number of
messages of both classes in the system, is equal (by Little's
Theorem applied to each class) to

    $$ \bar{X} = \lambda_1 D_1 + \lambda_2 D_2 = \lambda_1 W_1 + \lambda_2 W_2 + \rho , \quad \text{for } \rho < 1 . $$    (I.5.4-20)

The probability P(X=0) that the system will be totally empty is given
by

    $$ P(X = 0) = 1 - \rho = 1 - \lambda_1 S_1 - \lambda_2 S_2 , $$    (I.5.4-21)

for $\rho < 1$. Therefore, the system index of utilization, U, is
expressed as

    $$ U = P(X > 0) = \rho = \lambda_1 S_1 + \lambda_2 S_2 . $$    (I.5.4-22)

Thus, the central processor is kept busy in serving (processing)
both priority and regular messages a portion U = $\rho$ of the time,
when $\rho < 1$. (For $\rho \ge 1$, clearly U = 1.)

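The average waiting time W of a message chosen at random is
given, by (I.5.4-7), (I.5.4-13) and (I.5.4-18), as

    $$ W = \frac{\lambda_1}{\lambda} W_1 + \frac{\lambda_2}{\lambda} W_2 = \frac{\lambda_1 \overline{S_1^2} + \lambda_2 \overline{S_2^2}}{2 (1 - \rho)} \cdot \frac{1 - (\lambda_1 / \lambda) \rho}{1 - \rho_1} $$    (I.5.4-23)

for $\rho < 1$.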
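
The non-preemptive two-class results above can be checked numerically.
The following is a minimal sketch that evaluates the class waiting
times from (I.5.4-13) and (I.5.4-18) and averages them through
(I.5.4-7). The function name, argument names and returned dictionary
keys are illustrative assumptions.

```python
def nonpreemptive_two_class(lam1, lam2, s1, s2, s1_sq, s2_sq):
    """Mean waits and delays for two non-preemptive priority classes.

    lam1, lam2   : Poisson arrival intensities [mess./sec]
    s1, s2       : mean processing times of class 1 and class 2
    s1_sq, s2_sq : second moments of the processing times
    """
    rho1 = lam1 * s1
    rho2 = lam2 * s2
    rho = rho1 + rho2
    if rho >= 1.0:
        raise ValueError("rho >= 1: class-2 delays grow without bound")
    residual = lam1 * s1_sq + lam2 * s2_sq
    w1 = residual / (2.0 * (1.0 - rho1))                # (I.5.4-13)
    w2 = residual / (2.0 * (1.0 - rho1) * (1.0 - rho))  # (I.5.4-18)
    lam = lam1 + lam2
    w = (lam1 * w1 + lam2 * w2) / lam                   # (I.5.4-7)
    return {"W1": w1, "W2": w2, "W": w,
            "D1": w1 + s1, "D2": w2 + s2, "U": rho}


if __name__ == "__main__":
    # Exponential processing times with equal means (second moment 2*S^2).
    print(nonpreemptive_two_class(0.3, 0.4, 1.0, 1.0, 2.0, 2.0))
```

With equal mean processing times in both classes, the returned W
coincides with the FCFS (Pollaczek-Khintchine) waiting time, while W1
falls below it and W2 rises above it.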

It can be noted that if $S_1 = S_2$, the waiting time W is
identical to that obtained under a FCFS service discipline. The
variance of the message waiting time, when considering a message
chosen at random, is however lower when a FCFS scheme is used rather
than a priority scheme. Of course, the priority scheme yields
average waiting time values lower than W for higher priority
jobs, while correspondingly higher waiting time values are
experienced by lower priority tasks.

I.5.5 The Earliest Due Date Scheduling Discipline

It is necessary in a number of operation modes of the Space
Shuttle avionics system to implement a dynamic priority queueing
discipline. The computer and network resources are then
assigned to the users on a dynamic basis, based upon the current
state of the network, the current queue sizes and experienced job
delays and the current requirements for service. The job currently
in the queue is chosen to be processed by the computer system in
accordance with its spent waiting time in the system, priority,
required response time, required service duration and the
similar characteristics of the other jobs presently queued for
service.

A general model of such a dynamic priority service discipline,
which is particularly important for the proper operation of the Space
Shuttle data network in high traffic and critical phases, is
described in the following. It is described as an Earliest Due
Date (EDD) service discipline.

Jobs (or tasks, or messages) arriving at the processor are
classified into k classes. A class i job is associated with an
urgency number $u_i$, i = 1,2,...,k. Let

    $$ u_1 \le u_2 \le \cdots \le u_k . $$    (I.5.5-1)


The lower the urgency number, the more urgent is the required
service. If a class i job arrives at the system at time $t_i$, it is
assigned a real number

    $$ d_i = t_i + u_i . $$    (I.5.5-2)

This number $d_i$ can be regarded as a dynamic priority number.

Under an associated head of the line discipline, the processor
admits into service the job with the minimum value of $\{d_i = t_i + u_i\}$.
Ties can be broken by choosing the job with the minimum urgency
number. If preemption of a job (from service) is allowed, the
scheduling procedure is modified so that the system is continuously
monitored, and the job that is being processed has the minimum
value of $\{t_i + u_i\}$ out of all jobs in the system.

In comparing this dynamic priority scheduling rule with the
static priority discipline presented in Sections I.5.3-I.5.4, we
note the following. In applying the dynamic queueing rule, the $u_i$'s
serve the purpose of distinguishing between static priority classes.
Thus, a class 1 job is of highest static priority and a class k
job has the lowest static priority. But in addition a job that
has been waiting for service, as reflected by its arrival time $t_i$,
gains in priority dynamically over time.

For our purposes, it is generally convenient to let the $u_i$'s
correspond to the interval until the due date is reached. Thus,
a class i job arriving at time $t_i$ has a due date $t_i + u_i$ desired for
receiving service. Subsequently, we can choose $u_i$ to reflect the
desired response time and priority of class i jobs, in relation to
the other jobs.

As such, this priority scheme is noted to realize scheduling
by the earliest due date (EDD) rule in the processing queueing
system.

As special cases, if we set $u_i = 0$ for each i, this service
discipline becomes a FCFS scheduling rule, while if $u_2 - u_1 \to +\infty$
we have a static priority service procedure where class 1 jobs
are always processed ahead of class 2 jobs present at the same
time. Thus, by changing the difference in urgency numbers
from 0 to $+\infty$, the discipline evolves from a FCFS one to a
static priority one.
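
The head-of-the-line EDD rule can be sketched with a priority heap
keyed on the due date d = t + u, with ties broken by the smaller
urgency number. This is a minimal non-preemptive illustration; the
class name EDDQueue and its methods are hypothetical names introduced
here, not part of the report.

```python
import heapq


class EDDQueue:
    """Non-preemptive earliest-due-date queue (illustrative sketch)."""

    def __init__(self, urgency):
        # urgency maps a class index i to its urgency number u_i
        self._urgency = dict(urgency)
        self._heap = []
        self._seq = 0  # arrival counter, keeps heap entries comparable

    def arrive(self, job_class, t):
        """Register a job of the given class arriving at time t."""
        u = self._urgency[job_class]
        due = t + u  # dynamic priority number d = t + u
        heapq.heappush(self._heap, (due, u, self._seq, job_class, t))
        self._seq += 1

    def next_job(self):
        """Return (class, arrival time, due date) of the job to serve."""
        due, _, _, job_class, t = heapq.heappop(self._heap)
        return job_class, t, due


if __name__ == "__main__":
    q = EDDQueue({1: 0.0, 2: 5.0})
    q.arrive(2, 0.0)   # due date 5
    q.arrive(1, 3.0)   # due date 3
    q.arrive(1, 6.0)   # due date 6
    print([q.next_job() for _ in range(3)])
```

In this example the class-2 job that arrived at time 0 is served ahead
of the class-1 job arriving at time 6, since it has already waited
longer than the urgency difference of 5. Setting all urgency numbers
to 0 reduces the ordering to FCFS.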

We indicate now a few performance characteristics associated
with the EDD discipline. Assume k = 2, so that we have only 2
classes of jobs. Class-1 jobs are higher priority jobs with an
urgency number $u_1 = u_h$. Class-2 jobs have lower priority and an
urgency number $u_2 = u_\ell$. We let $W_h(t)$ denote the waiting time
(in the queue, prior to initiation of service) of a higher-priority
(class 1) job arriving (virtually) at time t. Similarly,
we let $W_\ell(t)$ be the waiting time of a class-2 lower priority job
arriving at time t. We can also consider a non-preemptive or
preemptive-resume service discipline. The waiting time at t of a higher
priority job under the preemptive-resume and non-preemptive disciplines
is denoted as $W_{h,p}(t)$ and $W_{h,n}(t)$, respectively. The waiting
time $W_\ell(t)$ of the lower priority job is clearly independent of
whether a preemptive or non-preemptive procedure is employed.

We find that for t u^. 

To demonstrate further these inequalities we define the lateness
of a lower priority job, $L_\ell(t)$, and of a higher priority job,
$L_{h,n}(t)$ or $L_{h,p}(t)$, at t, by

    $$ L_\ell(t) = W_\ell(t) - u_\ell , $$    (I.5.5-4a)

    $$ L_{h,n}(t) = W_{h,n}(t) - u_h , $$    (I.5.5-4b)

    $$ L_{h,p}(t) = W_{h,p}(t) - u_h . $$    (I.5.5-4c)


Thus the lateness L(t) of a job at time t describes the difference
between the job waiting time in the queue and its urgency value.
If the urgency value corresponds to a desired expected job waiting
time, then the lateness variable describes the deviation of the job
waiting time from its desired expected value. We then have, for
$t \ge u_\ell$,

    $$ L_\ell(t - u_\ell) \le L_{h,p}(t - u_h) \le L_{h,n}(t - u_h) . $$    (I.5.5-5)


Thus, a lower priority job arriving at time $t - u_\ell$ will be served no
later than a higher priority job arriving at time $t - u_h$. When a
class-2 (lower priority) job waits at least $u_\ell - u_h$ units of time,
its due date becomes the same as that of a class-h job arriving
$u_\ell - u_h$ units later, and the jobs are then of equivalent priority.
Subsequently, equality occurs above and the above mentioned class-1
and class-2 jobs experience the same lateness values.

Thus, note that lower priority (class 2) jobs increase their
dynamic queueing priority index after waiting in the queue $u_\ell - u_h$
units of time, so that at this time they attain a dynamic priority
equivalent to that of higher priority (class 1) jobs.

To indicate explicit analytical results, we assume class-h
and class-$\ell$ jobs to arrive according to Poisson streams with
intensities $\lambda_h$ [jobs/sec] and $\lambda_\ell$ [jobs/sec], respectively. Also
assume required processing times of $S_h$ [sec/job] and $S_\ell$ [sec/job]
for high priority and low priority jobs, respectively. The related
moments of the required processing times are denoted as
$E(S_h)$, $E(S_\ell)$, $E(S_h^2)$ and $E(S_\ell^2)$. The traffic intensities are

    $$ \rho_h = \lambda_h E(S_h) , \quad \rho_\ell = \lambda_\ell E(S_\ell) , $$    (I.5.5-6)

    $$ \rho = \rho_h + \rho_\ell . $$    (I.5.5-7)

If $\rho < 1$, as we assume henceforth, the system will enter steady-state where
finite job waiting-times are experienced. We set

    $$ u = u_\ell - u_h . $$    (I.5.5-8)


We denote by $B_h(W)$ the busy-period duration spent in servicing
only newly arriving class-h jobs, starting with an initial service
load of W sec. Then the (steady-state) mean waiting times of
higher priority jobs under a non-preemptive rule, $E(W_{h,n})$, and
of low priority jobs, $E(W_\ell)$, are given as follows.

    $$ E(W_{h,n}) = E(W) - \rho_\ell \int_0^u P\{B_h(W) > y\} \, dy , $$    (I.5.5-9)

    $$ E(W_\ell) = E(W) + \rho_h \int_0^u P\{B_h(W) > y\} \, dy , $$    (I.5.5-10)


where E(W) is the mean waiting-time of an arbitrary job in the
combined-traffic queueing system, given by the Pollaczek-Khintchine
formula as

    $$ E(W) = \frac{\lambda_h E(S_h^2) + \lambda_\ell E(S_\ell^2)}{2 (1 - \rho)} . $$    (I.5.5-11)

The distribution of the busy-period $B_h(W)$ can be calculated
by considering the associated (high priority) M/G/1 queueing
system. In particular, the mean busy-period duration is equal to

    $$ \int_0^\infty P\{B_h(W) > y\} \, dy = E[B_h(W)] = \frac{E(W)}{1 - \rho_h} . $$    (I.5.5-12)

We can thus write for each u > 0,

    $$ \int_0^u P\{B_h(W) > y\} \, dy = g(u) \, \frac{E(W)}{1 - \rho_h} . $$    (I.5.5-13)

The function g(u) is defined by the above equation, and is clearly a
continuous monotone increasing function of u assuming values in [0,1]:

    $$ 0 \le g(u) \le 1 , \quad g(0) = 0 , \quad g(\infty) = 1 . $$    (I.5.5-14)

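Substituting in (I.5.5-9)-(I.5.5-10), we obtain

    $$ E(W_{h,n}) = E(W) \left[ 1 - \frac{\rho_\ell}{1 - \rho_h} \, g(u) \right] , $$    (I.5.5-15)

    $$ E(W_\ell) = E(W) \left[ 1 + \frac{\rho_h}{1 - \rho_h} \, g(u) \right] . $$    (I.5.5-16)

In particular, we observe that

    $$ E(W_\ell) - E(W_{h,n}) = \frac{\rho}{1 - \rho_h} \, E(W) \, g(u) , $$    (I.5.5-17)

yielding the difference in average waiting times between lower
priority and higher priority jobs, using an EDD service discipline
with $u = u_\ell - u_h$.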

Consider now the two extreme special cases. If $u = u_\ell - u_h = 0$,
we have a FCFS service discipline, and then g(u) = g(0) = 0 so that

    $$ E(W_{h,n}) = E(W_\ell) = E(W) . $$    (I.5.5-18)

Thus, no priority classes are being distinguished, and the average
waiting time of any job is given by (I.5.5-11).

If $u = u_\ell - u_h \to \infty$, then $g(u) \to g(\infty) = 1$, and we obtain

    $$ E(W_{h,n}) = \frac{\lambda_h E(S_h^2) + \lambda_\ell E(S_\ell^2)}{2 (1 - \rho_h)} , $$    (I.5.5-19)

    $$ E(W_\ell) = \frac{\lambda_h E(S_h^2) + \lambda_\ell E(S_\ell^2)}{2 (1 - \rho_h)(1 - \rho)} . $$    (I.5.5-20)

These are the same equations as noted in a previous section for the
average waiting-times of high and low priority jobs in a system
with two stationary priority classes.

It is obvious by (I.5.5-17) that by choosing the urgency
difference number $u = u_\ell - u_h$, we can obtain a desired difference
$E(W_\ell) - E(W_{h,n})$ between the average waiting-times of low and high
priority jobs. This difference is 0 when u = 0, g(0) = 0, and a FCFS
procedure is used. The difference attains its maximal value at
$u = \infty$, $g(\infty) = 1$, when stationary priorities are used.

For example, if $\rho_h = 0.5$, $\rho_\ell = 0.4$, $\rho = \rho_h + \rho_\ell = 0.9$, E(W) is
relatively high and when 2 stationary priorities are used,
$u = \infty$, $g(\infty) = 1$, we have

    $$ E(W_\ell) - E(W_{h,n}) = 1.8 \, E(W) . $$

This difference can be high for certain applications. By using an
EDD scheduling rule we can choose $0 \le g(u) \le 1$ to lower the latter
difference. For example, we can set $u = u_\ell - u_h$ to yield g(u) = 0.2,
and then

    $$ E(W_\ell) - E(W_{h,n}) = 1.8 \, E(W) \times 0.2 = 0.36 \, E(W) , $$

which can be acceptable.
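
The numerical example above can be reproduced with a short sketch
that evaluates the two class waiting times as functions of g(u),
using the forms E(W)[1 - rho_l g/(1 - rho_h)] and
E(W)[1 + rho_h g/(1 - rho_h)] that follow from (I.5.5-9), (I.5.5-10)
and the definition of g(u). The function name and its arguments are
illustrative assumptions; g is supplied directly rather than computed
from a busy-period distribution.

```python
def edd_mean_waits(mean_wait, rho_h, rho_l, g):
    """Mean waits (high class, low class) under non-preemptive EDD.

    mean_wait : E(W), the combined-traffic mean waiting time
    rho_h, rho_l : traffic intensities of the high and low classes
    g : value of g(u) in [0, 1]; 0 gives FCFS, 1 gives static priority
    """
    if not 0.0 <= g <= 1.0:
        raise ValueError("g(u) must lie in [0, 1]")
    if rho_h + rho_l >= 1.0:
        raise ValueError("total traffic intensity must be below 1")
    w_high = mean_wait * (1.0 - rho_l * g / (1.0 - rho_h))
    w_low = mean_wait * (1.0 + rho_h * g / (1.0 - rho_h))
    return w_high, w_low


if __name__ == "__main__":
    for g in (0.0, 0.2, 1.0):
        w_high, w_low = edd_mean_waits(1.0, 0.5, 0.4, g)
        print(g, w_high, w_low, w_low - w_high)
```

In units of E(W), the printed difference grows from 0 (FCFS) through
0.36 at g = 0.2 to 1.8 under static priorities, while the weighted sum
rho_h E(W_h) + rho_l E(W_l) stays equal to rho E(W).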

It should be noted that although class-1 jobs experience
shorter waiting times than class-2 jobs, the same is not true
regarding the corresponding lateness values. In fact, the lateness
variable $L_\ell$ of a low priority job is stochastically smaller than
the lateness variable $L_h$ (representing $L_{h,p}$ or $L_{h,n}$) of a high
priority job. Thus,

    $$ P\{L_\ell > x\} \le P\{L_h > x\} . $$    (I.5.5-21)

Hence,

    $$ E(L_\ell) \le E(L_h) = E(W_h) - u_h \le E(W) - u_h . $$    (I.5.5-22)

This property that jobs from the class with the earliest due date
have the maximum mean lateness, though having the shortest waiting
times, is desirable from the system's point of view in meeting the
most urgent needs of the jobs.

In designing an earliest due date rule we can properly optimize
the choice of the urgency (due date) parameters $\{u_i\}$, as illustrated
by the following. Assume 2 priority classes, and a non-preemptive EDD
service discipline. Let $c_1 > 0$, $c_2 > 0$ represent costs per unit of
waiting time for class 1 and class 2 jobs, respectively. An overall
cost value C is chosen then as

    $$ C = c_1 E(W_1) + c_2 E(W_2) . $$    (I.5.5-23)

We wish to choose $u_1$, $u_2$ to minimize C. By the above expressions,
we conclude that we need to minimize

    $$ (c_2 \rho_1 - c_1 \rho_2) \int_0^{u_2 - u_1} P\{B_1(W) > y\} \, dy . $$

Subsequently, we conclude that C is minimized over all dynamic
priority queue-disciplines if we choose:

    $u_2 - u_1 = 0$ (FCFS), if $c_2/c_1 > \rho_2/\rho_1$ ;
    $u_2 - u_1 = \infty$ (static priority), if $c_2/c_1 < \rho_2/\rho_1$ ;    (I.5.5-24)

and any dynamic priority discipline, if $c_2/c_1 = \rho_2/\rho_1$.
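
The decision rule (I.5.5-24) depends only on the sign of the
coefficient multiplying the integral in the cost expression. The
following is a minimal sketch of that rule; the function name and
the use of None for the indifferent case are illustrative
assumptions.

```python
import math


def optimal_urgency_gap(c1, c2, rho1, rho2):
    """Return the cost-minimizing urgency difference u2 - u1.

    0.0      -> FCFS
    math.inf -> static priority for class 1
    None     -> every urgency difference gives the same cost
    """
    # Sign of the coefficient multiplying the busy-period integral.
    coeff = c2 * rho1 - c1 * rho2
    if coeff > 0:
        return 0.0       # any positive gap would only add cost
    if coeff < 0:
        return math.inf  # cost decreases as the gap grows
    return None


if __name__ == "__main__":
    print(optimal_urgency_gap(1.0, 1.0, 0.5, 0.4))  # FCFS
    print(optimal_urgency_gap(2.0, 1.0, 0.5, 0.4))  # static priority
```

Because the integral is nonnegative and nondecreasing in u2 - u1, the
optimum always sits at one of the two extremes unless the coefficient
vanishes.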

In particular, if we set

    $$ c_1 = a_1 \frac{\lambda_1}{\lambda_1 + \lambda_2} , \quad c_2 = a_2 \frac{\lambda_2}{\lambda_1 + \lambda_2} , \quad a_1 > 0 , \; a_2 > 0 , $$    (I.5.5-25)

then, if

    $$ E(S_1)/a_1 < E(S_2)/a_2 , $$    (I.5.5-26)

the optimal policy is to set $u_2 - u_1 = \infty$ and use a static priority
discipline. Thus, we then always attach higher priority to jobs
whose weighted (by $a_i$) required processing times are shorter.

I.6 THE COMPUTER SYSTEM: QUEUEING MODELS AND PERFORMANCE ANALYSIS


I.6.1 Operating Systems

We consider a computer system, such as that associated with
the general purpose computer (GPC) of the Space Shuttle avionics
system.

The term "process" is used to denote a program in execution.

The computer system can be defined in terms of the various
supervisory and control functions it provides for the processes
created by its users:

a. Creating and removing processes.

b. Controlling the progress of processes.

c. Responding to irregular conditions that may occur
during the execution of the process, such as: interrupts,
arithmetic or machine or addressing errors, protection
violations.

d. Allocating hardware resources among processes.

e. Providing access to software resources.

f. Providing protection, access control and information
security.

g. Providing interprocess communication and synchronization.

The computer system software that assists the hardware in
implementing these functions is known as the operating system.

To become an efficient processing system, a computer system
will generally incorporate the following characteristics:

a. Concurrency - parallel processing.

b. Automatic resource allocation.

c. Sharing of resources by more than one process.

d. Multiplexing of information over an access channel,
and providing remote conversational access to
system resources or processes.

e. Asynchronous operation.

f. Long term storage of information; e.g., in the form of
a file system.

These characteristics involve the management of the computer
memory and processes. Algorithms need to be efficiently designed
for:

a. Managing, controlling and scheduling processes;

b. Managing and controlling main and auxiliary memory
devices;

c. Managing and controlling the flows of information among
the various devices in a computer system.

The two important major sets of resources for the computer
system are processor resources and memory resources. A processor
is any device which handles information or carries out the steps
of a process, such as: a central processing unit, an arithmetic
processor, an I/O (input/output) processor or an access channel.
A memory is a device which is used for storage of information.

The capacity of a memory device is the number of words (or bytes,
or bits) of information that it can store. The access time of a
memory device is the average time duration between the receipt
and completion of a "memory-fetch" request, when queueing delays
are neglected. A memory device is random access if the access
time of each storage site is the same; examples: semiconductor
and core memories. A memory device is positionally addressed if
the access time of a word depends on its position; examples: disks,
drums and tapes.

Information is generally stored in a computer system in a two
level storage system: main memory and auxiliary memory. Information
residing in main memory is usually random access and requires very
short access time, so that it can be immediately accessible for
processing. Otherwise, it resides in auxiliary memory which is
usually positionally addressed and requires relatively longer
access times.

For the GPC on the Space Shuttle, the main memory is composed
of pluggable, random-access, non-volatile, destructive-read-out
ferrite core modules with a monolithic option. The access time
for this memory system is:

    access time = 0.375 $\mu$sec .

The capacity of the memory is:

    capacity = 1310720 bits = 40960 words (of 32 bits each),

where the word length is:

    data word length = 16/32 bits (fixed point)
                     = 32/64 bits (floating point) ;

    instruction word length = 16/32 bits .

Also, for this GPC we have:

    number of instructions in repertoire = 154 ;

    computing speed = 480 x 10^3 operations/sec (fixed point) ;
                    = 325 x 10^3 operations/sec (floating point).

On the Space Shuttle orbiter, two high capacity tape units
are also used as mass memory. The storage capacity of each is 134
megabits of data. They are used to store permanent on-board off-line
information. They thus supplement the on-line random-access
internal memories of the Space Shuttle computers.

Process coordination characteristics are important in designing
a computer system and assessing its performance. In a
multiprogramming system, both process interruption at arbitrary times
and peripheral activity of arbitrary speed are carried on simultaneously.

It is thus necessary to guarantee that the computation performed
when cooperating processes are involved is independent of the relative
speeds of the different tasks. Computation then is required to be
determinate. In addition, in considering process coordination and
control problems, one should study the following problems:
deadlocks; mutual exclusion between tasks; and synchronization
objectives, needed for example to ensure that a certain procedure
starts at the proper time in correspondence with the occurrence
of a certain event.

1.6.2 Memory Management 

A memory management algorithm is composed mainly of the following 
policies: 

a. The fetch policy determining when a block is transferred 
from auxiliary to main memory. 

b. The placement policy determining the unallocated space 
of main memory into which an incoming block is to be placed. 

c. The replacement policy determining which blocks are to be 
removed from the main memory. 

The structure of a two level memory system is shown in Fig. 1.6-1. 
The above mentioned policies are implemented by move commands which 
control the moving of blocks between main and auxiliary memories. 


Figure 1.6-1. (Diagram labels: PROCESSOR, REFERENCES, MOVE COMMANDS, DATA CHANNEL, AUXILIARY MEMORY.) 

To analyze the memory management procedure used in the GPCs 
of the Space Shuttle avionics system, it is particularly useful to 
use a virtual memory technique. 

A virtual memory can be regarded as the main memory of a 
simulated (or virtual) computer. A virtual memory system is 
described in terms of two spaces, N and M, and a mapping f. The 
address space N of a task is the set of addresses that can be 
generated by a processor as it executes the task. Tasks can 
share the same address space. In multiprogramming systems, several 
address spaces are utilized. The memory space M of the system 
represents the set of locations in the physical main memory. 

The address map f provides for the transformation 

$f : N \to M \cup \{\phi\}$ 

from space N to space M or to the set $\{\phi\}$. The set $\{\phi\}$ indicates that 
the desired word is presently not in the memory space. Thus, if 
x is an address in N, then if 

$f(x) = y \in M$ , 

the desired address is stored in main memory at location y at that 
time. If, however, $f(x) \notin M$, i.e. $f(x) \in \{\phi\}$, the desired address is not 
in main memory, and a fault condition results. Move commands are 
then initiated and the table describing f is adjusted. This is 
illustrated in Fig. 1.6-2. 
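
As an illustration only, the address map can be sketched in Python as a lookup table in which a missing entry plays the role of $\phi$. The class name `AddressMap`, the lowest-free-location placement rule and the first-in first-out replacement rule are assumptions made for this sketch; they are not taken from the GPC design.

```python
from typing import Dict, Optional


class AddressMap:
    """Toy address map f: N -> M U {phi}; None stands for phi (hypothetical helper)."""

    def __init__(self, main_size: int) -> None:
        self.main_size = main_size        # number of locations in memory space M
        self.table: Dict[int, int] = {}   # x in N -> y in M, kept in order of arrival
        self.faults = 0

    def f(self, x: int) -> Optional[int]:
        """Return the main-memory location of x, or None if x is not in M."""
        return self.table.get(x)

    def reference(self, x: int) -> int:
        """Resolve x; on a fault, perform a 'move' and adjust the table."""
        y = self.f(x)
        if y is None:
            self.faults += 1
            y = self._move_in(x)
        return y

    def _move_in(self, x: int) -> int:
        if len(self.table) < self.main_size:
            # Placement: take the lowest unallocated location (assumed policy).
            used = set(self.table.values())
            y = min(loc for loc in range(self.main_size) if loc not in used)
        else:
            # Replacement: evict the oldest resident entry, FIFO (assumed policy).
            victim = next(iter(self.table))
            y = self.table.pop(victim)
        self.table[x] = y
        return y
```

With a two-location main memory, references to addresses 10, 11, 10 cause two faults; a further reference to 12 causes a third fault and removes address 10 from the table.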


Figure 1.6-2. (Diagram labels: MEMORY SYSTEM, PROCESSOR, MAIN MEMORY, AUXILIARY MEMORY.) 


In analyzing the performance of auxiliary memory systems, one 
considers mainly the underlying queueing problems. This is the case 
due to the relatively long access times involved. Consequently, such 
memory units can become congestion centers within the computer system. 
The models and analysis techniques involved are similar to those 
presented in the sections on queueing models and analysis. The 
index of performance usually used in choosing the related optimal 
scheduling algorithm is the throughput of the 
memory subsystem, which is to be maximized. 

In our applications the problems mentioned above concerning 
data transfers between main memory and auxiliary memory arise 
also in connection with data flows between main memory and I/O 
devices. In this connection, one needs also to consider the 
associated buffer problems. Shared (pooled) buffers are much 
more efficient than individual dedicated buffers. The proper 
related performance criteria here are buffer occupancy and 
overflow probability. We note that under multiprogramming the 
main memory can be regarded as a shared buffer. 

In studying main memory management the objective is usually 
related to maximum execution speed of programs. 

1.6.3 On Computer Scheduling Procedures 

In modeling and studying processor scheduling procedures we 
can distinguish between deterministic scheduling rules and 
probabilistic scheduling models. 

In considering deterministic scheduling disciplines we assume 
that we are given a (partially ordered) set of tasks whose execution 
(required processing) times $\{S_i\}$ are known. We also assume that 
there are m (identical) processors available to execute these tasks. 

Two performance measures are then considered: the time until 
the last task is completed and the average turnaround (flow) time. 

The first measure is related to the system utilization factor U. 
Thus, if a given schedule finishes in time T, the utilization factor 
of the processors under the schedule is 

$U = \frac{\sum_i S_i}{m T}$ . 

Hence, minimizing T is equivalent to maximizing U. 

The second measure is of interest to the users of the system. 
Minimizing it often also minimizes the average number 
of incomplete tasks. 
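
The two deterministic measures can be illustrated with a small Python sketch. It assumes independent tasks (no precedence constraints) that are all available at time zero, and it uses a simple list-scheduling rule (each task goes to the processor that becomes free first); the function name `list_schedule` and this rule are choices made for the illustration, not a procedure prescribed by the study.

```python
import heapq


def list_schedule(times, m):
    """Assign tasks, in the given order, to the earliest-free of m processors.

    Returns (T, U, mean_flow): finishing time of the schedule, utilization
    factor U = sum(S_i) / (m * T), and average turnaround (flow) time.
    """
    free_at = [0.0] * m              # instants at which each processor is free
    heapq.heapify(free_at)
    completions = []
    for s in times:
        start = heapq.heappop(free_at)
        finish = start + s
        completions.append(finish)
        heapq.heappush(free_at, finish)
    T = max(completions)
    U = sum(times) / (m * T)
    mean_flow = sum(completions) / len(completions)
    return T, U, mean_flow


# Longest tasks first shortens the schedule; shortest first lowers the flow time.
print(list_schedule([3, 2, 2, 1], 2))
print(list_schedule([1, 2, 2, 3], 2))
```

For these four tasks on two processors, the longest-first order finishes at T = 4 with U = 1, while the shortest-first order finishes at T = 5 with U = 0.8 but gives the smaller average flow time (2.75 against 3.25).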

The deterministic models assume that the processing times 
of all tasks are known in advance and that all tasks are available 
for execution at once. It is more realistic in our applications to 
assume that the processing times required by the various tasks are 
random, governed by certain probability distributions. Also, we then 
assume that tasks arrive at the processing system at random times. 

We then need to specify the joint statistics of the task 
interarrival times. 

Using these probabilistic characterizations of the tasks 
arrival streams and required execution times, the processing 
system is modeled as a queueing system. The associated service 
discipline then represents the task scheduling rule. 

Queueing models have been presented, discussed and analyzed 
in Section 1.4. In particular, time-sharing queueing models have 
been considered. Priority service disciplines have been classified, 
discussed and analyzed in Section 1.5. 

In assigning priorities to tasks, we associate an index of 
preference or urgency to the processing of a task relative to 
other tasks. As noted in Section 1.5, these priority or urgency 
indices can be assigned on a static basis or dynamically in 
accordance with the state of the system and the task's desired 
response time or actual current lateness. 

Systems can use "time slicing" to limit the length of 
processing time that can be given to a task at one time. Tasks 
which use the processor for longer than a certain quantum 
duration at one time are interrupted and asked to release the processor. They 
are then reassigned for service in accordance with the system 
service discipline. 
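
A minimal Python sketch of time slicing follows. It assumes that all tasks are present at time zero, that an interrupted task rejoins the tail of a single queue (round-robin), and that switching between tasks costs no time; the function name `round_robin` is hypothetical.

```python
from collections import deque


def round_robin(service_times, quantum):
    """Return the completion time of each task under round-robin time slicing."""
    remaining = list(service_times)
    queue = deque(range(len(remaining)))
    done = [0.0] * len(remaining)
    clock = 0.0
    while queue:
        i = queue.popleft()
        run = min(quantum, remaining[i])   # a task never holds the processor past one quantum
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)                # interrupted: rejoin the tail of the queue
        else:
            done[i] = clock
    return done


print(round_robin([3, 1, 2], quantum=1))   # short tasks finish early
print(round_robin([3, 1, 2], quantum=10))  # a large quantum behaves like first-come first-served
```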

In particular, we have considered the following service 
disciplines: 

1. First-come first-served (FCFS). 

2. Round-robin, foreground-background and multilevel service 
procedures. 

3. Service disciplines for tasks classified into fixed 
priority classes. 

4. Earliest due date dynamic priority queueing disciplines. 

In addition, one can incorporate service disciplines that assign 
dynamic priorities in accordance with task processing times, giving, 
for example, priority to shorter tasks over longer ones, or, as we 
already noted, giving service to tasks that have currently received 
the least amount of overall service. 

We present and study in the next sections certain queueing 
models that can be used for performance prediction in our data 
processing system. 

1.6.4 A Markovian Queueing Model: Finite Buffer Facility 

We present in this and the next two sections a simple Markovian 
queueing model. We also present its performance characteristics. 
It is noted that this model can be used for a first-order performance 
prediction. It allows the incorporation of arrival and service 
rates that depend upon the state of the system. Consequently, 
one can use it to analyze a processing system with a finite buffer 
size, multiple processors and an arrival task stream that is 
generated by a finite source of users (or terminals). 


We assume task (or job, or message, or customer) processing times to be 
exponentially distributed, with mean required task processing 
time $\mu^{-1}$ [sec/task]. Tasks arrive at the processor according 
to a Poisson stream with intensity $\lambda$ [tasks/sec]. 

Assume a single processing unit with a finite buffer facility, 
with storage space for at most L tasks. Tasks arriving when the 
buffer is full are assumed to be rejected. Let $p_n$ denote the 
probability that n tasks are in the system (at steady state), queueing 
or being processed. Then, we have: 

$p_n = \frac{(1-\rho)\rho^n}{1-\rho^{L+1}}$ , n = 0,1,...,L , (I.6.4-1) 

where $\rho = \lambda/\mu$. In particular, the system utilization index U 
describing the probability that the processor is busy is given by 

$U = 1 - p_0 = 1 - \frac{1-\rho}{1-\rho^{L+1}} = \frac{\rho-\rho^{L+1}}{1-\rho^{L+1}}$ . (I.6.4-2) 

The average queue size $\bar{X}$ is given as 

$\bar{X} = \sum_{n=0}^{L} n p_n = \frac{1-\rho}{1-\rho^{L+1}} \sum_{n=0}^{L} n \rho^n$ . (I.6.4-3) 

The probability $P_R$ that a task is rejected from the system, not 
accepted for service due to a full buffer, is given by 

$P_R = p_L = \frac{(1-\rho)\rho^L}{1-\rho^{L+1}}$ , $0 < \rho < 1$ . (I.6.4-4) 

Clearly, $P_R \to 0$ as $L \to \infty$, while $P_R = \rho/(1+\rho)$ for L = 1. The character of 
the rejection probability (or overflow probability) curve, as a 
function of the buffer size L-1, is shown in Fig. 1.6-3. 



The average waiting time W of a task, provided this task 
is accepted into the system (i.e., the buffer is not full), is 
expressed as 

$W = \mu^{-1} \sum_{n=0}^{L-1} n \frac{p_n}{1-P_R}$ . (I.6.4-5) 

We thus note that too small a buffer size (small L) can imply 
a very high overflow probability, or rejection probability, as 
illustrated by eq. (I.6.4-4) and Fig. 1.6-3. However, increasing 
the buffer size beyond a large enough value L* would not significantly 
reduce the overflow probability. 
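
The measures of this finite-buffer model can be evaluated numerically with the following Python sketch. The function name `mm1_finite` and the returned dictionary keys are hypothetical. The waiting time is computed under the assumption that an accepted task sees n tasks in the system with probability $p_n/(1-P_R)$ and then waits n mean service times; this is a standard reading of the model, not a formula confirmed by the study.

```python
def mm1_finite(lam, mu, L):
    """Steady-state measures of a single processor holding at most L tasks."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        p = [1.0 / (L + 1)] * (L + 1)        # limiting case rho = 1
    else:
        norm = (1.0 - rho) / (1.0 - rho ** (L + 1))
        p = [norm * rho ** n for n in range(L + 1)]
    p_reject = p[L]                           # task arrives to a full system
    mean_in_system = sum(n * pn for n, pn in enumerate(p))
    # Accepted tasks find n < L tasks ahead of them (assumption stated above).
    wait = sum(n * p[n] for n in range(L)) / (mu * (1.0 - p_reject))
    return {"p": p, "U": 1.0 - p[0], "X": mean_in_system,
            "P_R": p_reject, "W": wait}


for L in (1, 2, 5, 10):
    r = mm1_finite(lam=1.0, mu=2.0, L=L)
    print(L, round(r["P_R"], 5), round(r["U"], 5), round(r["W"], 5))
```

The printed rows show the rejection probability falling quickly as L grows while the waiting time of accepted tasks rises, which is the trade-off discussed above.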

If a rejection probability no higher than p is desired, 

$P_R \le p$ , (I.6.4-6) 

then by (I.6.4-4) we should set the buffer size L-1 according to 
the formula 

$L \ge \frac{\ln [ p / (1 - \rho + p\rho) ]}{\ln \rho}$ , $0 < \rho < 1$ . (I.6.4-7) 


Note that L-1 is the capacity of the buffer facility measured in 
number of messages. The average capacity in bits, $L_B$, is obtained 
as follows. Assume the (average) processing rate of the service 
facility to be 

processor rate = C [bits/sec] . (I.6.4-8) 

Then since the average task required processing time is 

average task required processing time = $\mu^{-1}$ [sec/task] , (I.6.4-9) 

we conclude that 

average task length in bits = $C\mu^{-1}$ [bits/task] . (I.6.4-10) 

Therefore, the capacity of the storage facility in bits is 

$L_B = L C \mu^{-1}$ [bits] . (I.6.4-11) 
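
The buffer-sizing step can be sketched in Python as follows. The direct search uses only the rejection probability $(1-\rho)\rho^L/(1-\rho^{L+1})$ of this section; the closed form is obtained by solving that expression for L and is offered as an assumption rather than as the formula of the study. All function names are hypothetical.

```python
import math


def rejection_probability(rho, L):
    """P_R for a single processor holding at most L tasks (rho != 1)."""
    return (1.0 - rho) * rho ** L / (1.0 - rho ** (L + 1))


def smallest_L(rho, p_max):
    """Smallest L with P_R <= p_max, by direct search (requires 0 < rho < 1)."""
    L = 1
    while rejection_probability(rho, L) > p_max:
        L += 1
    return L


def smallest_L_closed_form(rho, p_max):
    """Same quantity from rho**L <= p / (1 - rho + p * rho) (derived here)."""
    bound = math.log(p_max / (1.0 - rho + p_max * rho)) / math.log(rho)
    return max(1, math.ceil(bound))


def storage_bits(L, C, mu):
    """Capacity in bits, L * C / mu, for processing rate C [bits/sec]."""
    return L * C / mu


L = smallest_L(rho=0.5, p_max=0.01)
print(L, smallest_L_closed_form(0.5, 0.01), storage_bits(L, C=1.0e6, mu=100.0))
```

For $\rho = 0.5$ and a target of 0.01 both routes give L = 6, that is, a buffer of L-1 = 5 waiting positions.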

Also note that as the storage capacity L is decreased the 
average queue size $\bar{X}$ and the average waiting time W of an accepted 
task both decrease, since accepted tasks have to contend for service 
with fewer other accepted tasks. The overflow probability then 
of course increases. 

1.6.5 A Finite Task Source Queueing Model 

It is necessary for our applications to be able to model, at 
certain operational modes of the data processing system, part of 
the incoming task stream as composed of a finite set of sources. 

For that purpose, assume that the processor experiences an 
arrival stream that originates from a set of N task sources (or 
terminals). Between the completion of its previous task and the 
submission of a new task to the processor, a source generally 
experiences a certain random delay time. This time delay is called 
the "think time" of the source. 

We assume here that the think time of each terminal (task source), 
of the N terminals, is exponentially distributed with mean $\lambda^{-1}$; i.e., 

Average source think time = $\lambda^{-1}$ [sec] . (I.6.5-1) 

The system model is shown in Figure 1.6-4. We note that if there are 
currently n tasks in the system, $n \le N$, only N-n new tasks can 
presently arrive (according to a Poisson stream with intensity 
$(N-n)\lambda$). 


Figure 1.6-4. (Diagram label: COMPLETION INDICATOR.) 






We assume, as in the previous section, that task processing 
times are exponentially distributed with mean $\mu^{-1}$; i.e., 

Average required task processing time = $\mu^{-1}$ [sec/task] . (I.6.5-2) 

Let 

$\rho = \lambda/\mu$ (I.6.5-3) 

be the traffic intensity parameter. We also set 

$P_n$ = P{n tasks in the system} , (I.6.5-4) 

at steady-state, considering both the task in service and the tasks 
waiting in the queueing facility. Then, we obtain 

$P_0 = \left[ \sum_{i=0}^{N} \frac{N!}{(N-i)!} \rho^i \right]^{-1}$ , (I.6.5-5a) 

$P_n = P_0 \rho^n \frac{N!}{(N-n)!}$ , n = 0,1,...,N . (I.6.5-5b) 

The system utilization index U is computed as 

U = P{processor busy} $= 1 - P_0 = 1 - \left[ \sum_{i=0}^{N} \frac{N!}{(N-i)!} \rho^i \right]^{-1}$ . (I.6.5-6) 

The task average waiting time W is equal to 

$W = \mu^{-1} \bar{X}$ , (I.6.5-7) 

where $\bar{X}$ is the average queue size, 

$\bar{X} = \sum_{n=0}^{N} n P_n$ . (I.6.5-8) 

If we also assume a finite storage capacity so that 

Number of tasks in system $\le L \le N$ , 

counting both tasks in the queueing facility and the task in 
service, we obtain the queue-size probabilities to be given as 
follows. 

$P_0 = \left[ \sum_{i=0}^{L} \frac{N!}{(N-i)!} \rho^i \right]^{-1}$ , 

$P_n = P_0 \rho^n \frac{N!}{(N-n)!}$ , n = 0,1,...,L . (I.6.5-10) 


The average queue size $\bar{X}$ is computed using Eqs. (I.6.5-8) and 
(I.6.5-10). The average waiting time W of an accepted message is 
computed by 

$W = \mu^{-1} \sum_{n=0}^{L-1} n \frac{(N-n) P_n}{\sum_{k=0}^{L-1} (N-k) P_k}$ . (I.6.5-11) 


The probability $P_R$ that a task will be rejected due to a full 
buffer (or the buffer will overflow) is equal to 

$P_R = P_L = P_0 \rho^L \frac{N!}{(N-L)!}$ . (I.6.5-12) 


Using these expressions, one can properly design the data 
processing system. In particular, if a maximum overflow probability 
$P_R$ is specified, one can compute by (I.6.5-12) the desired buffer size. 
The latter is equal to L-1 messages or $(L-1)C\mu^{-1}$ bits, where C is the 
processing rate (in bits/sec) of the service system. The system 
utilization index U is now given by 

U = P{processor is occupied} $= 1 - P_0 = 1 - \left[ \sum_{i=0}^{L} \frac{N!}{(N-i)!} \rho^i \right]^{-1}$ , (I.6.5-13) 

where $\rho = \lambda/\mu$. 
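
The finite-source probabilities of this section can be computed with the short Python sketch below, which covers both the unrestricted case (L = N) and the finite-storage case (L < N). The function name `finite_source_queue` and the dictionary keys are hypothetical; the weights $\rho^n N!/(N-n)!$ are those of the queue-size probabilities given above.

```python
from math import factorial


def finite_source_queue(N, lam, mu, L=None):
    """Single processor fed by N sources; at most L <= N tasks in the system."""
    if L is None:
        L = N
    rho = lam / mu
    weights = [factorial(N) // factorial(N - n) * rho ** n for n in range(L + 1)]
    P0 = 1.0 / sum(weights)
    P = [P0 * w for w in weights]
    mean_in_system = sum(n * Pn for n, Pn in enumerate(P))
    return {"P": P, "U": 1.0 - P[0], "X": mean_in_system, "P_L": P[L]}


full = finite_source_queue(N=2, lam=1.0, mu=1.0)
print(full["P"], full["U"], full["X"])

limited = finite_source_queue(N=3, lam=1.0, mu=1.0, L=2)
print(limited["P"], limited["P_L"], limited["U"])
```

In the unrestricted case the result can be checked by flow balance: the completion rate $\mu U$ must equal the submission rate $\lambda (N - \bar{X})$.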

1.6.6 A Multi-Processor Queueing Model 

In certain operational modes of the data processing system, we 
need to consider the situation where a task can be processed by any 
one of a set of processors. We thus present a Markovian queueing 
model to describe the queueing system performance characteristics 
in this situation. 

Assume the system to contain m identical processors (service 
units). Arriving tasks (or requests for processing) are stored 
in a queue if all m processors are busy. As soon as a processor 
becomes free, it accepts into service the task at the head of the 
queue. The system is illustrated in Fig. 1.6-5. 

Figure 1.6-5. 

Assume tasks (or messages) to arrive at the system according to 
a Poisson stream with intensity $\lambda$ [tasks/sec]. We also take the 
average required processing time for a task to be $\mu^{-1}$ [sec/task]. 
The required task processing time is assumed to be exponentially 
distributed. 

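We then obtain the queue-size probabilities $\{P_n\}$, where 

$P_n$ = P{n tasks in the system, queueing or being served} , (I.6.6-1) 

to be given by the following expressions: 

$P_0 = \left[ \frac{(m\rho)^m}{m!(1-\rho)} + \sum_{i=0}^{m-1} \frac{(m\rho)^i}{i!} \right]^{-1}$ , (I.6.6-2) 

$P_n = \frac{(m\rho)^n}{n!} P_0$ for $n \le m$ , $P_n = \frac{m^m \rho^n}{m!} P_0$ for $n \ge m$ , (I.6.6-3) 

where 

$\rho = \lambda/(m\mu)$ and $\rho < 1$ . (I.6.6-4) 

The system index of utilization U giving the fraction of time 
that the system is occupied (so that at least one processor is busy) 
is given by 

U = P{at least one processor is busy} $= 1 - P_0 = 1 - \left[ \frac{(m\rho)^m}{m!(1-\rho)} + \sum_{i=0}^{m-1} \frac{(m\rho)^i}{i!} \right]^{-1}$ , (I.6.6-5) 

with $\rho = \lambda/(m\mu) < 1$. 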

The fraction of time that all processors are busy, denoted as 
$U_m$, is equal to 

$U_m$ = P{all m processors are occupied} $= \sum_{n=m}^{\infty} P_n = 1 - \sum_{n=0}^{m-1} P_n = 1 - P_0 \sum_{n=0}^{m-1} \frac{(m\rho)^n}{n!}$ 

$= 1 - \left[ \frac{(m\rho)^m}{m!(1-\rho)} + \sum_{i=0}^{m-1} \frac{(m\rho)^i}{i!} \right]^{-1} \sum_{n=0}^{m-1} \frac{(m\rho)^n}{n!}$ . (I.6.6-6) 



The average task waiting-time W is computed as 

$W = (m\mu)^{-1} \sum_{n=m}^{\infty} (n-m+1) P_n$ . (I.6.6-7) 


For given message and traffic statistical parameters ($\lambda$ and $\mu$), 
we note that by increasing the number of parallel processors m we 
decrease the task waiting time (and consequently reduce its response 
time). However, at the same time we obtain a reduced value for the 
index of utilization (U or $U_m$). In designing the system, one then 
chooses the number of parallel processors m properly, using Eqs. 
(I.6.6-2)-(I.6.6-7), so that a high enough index of utilization is 
achieved while an acceptable task response time is guaranteed. 
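
The trade-off between waiting time and utilization can be explored with the following Python sketch of the m-processor model. The function name `mmm_queue` is hypothetical. The waiting time is evaluated with the standard first-come first-served result for this model, $W = P_m / [m\mu(1-\rho)^2]$, which is the closed form of $(m\mu)^{-1}\sum_{n \ge m}(n-m+1)P_n$; treating that as the intended expression is an assumption.

```python
from math import factorial


def mmm_queue(lam, mu, m):
    """Steady-state measures of m identical processors with Poisson arrivals."""
    rho = lam / (m * mu)
    if not 0.0 < rho < 1.0:
        raise ValueError("requires 0 < lam / (m * mu) < 1")
    a = m * rho
    P0 = 1.0 / (a ** m / (factorial(m) * (1.0 - rho))
                + sum(a ** i / factorial(i) for i in range(m)))

    def P(n):
        if n <= m:
            return P0 * a ** n / factorial(n)
        return P0 * m ** m * rho ** n / factorial(m)

    U = 1.0 - P0                                # at least one processor busy
    U_m = 1.0 - sum(P(n) for n in range(m))     # all m processors busy
    W = P(m) / (m * mu * (1.0 - rho) ** 2)      # mean wait in queue (see lead-in)
    return {"P": P, "U": U, "U_m": U_m, "W": W}


for m in (2, 3, 4):
    r = mmm_queue(lam=2.0, mu=2.0, m=m)
    print(m, round(r["W"], 5), round(r["U_m"], 5))
```

Increasing m from 2 to 4 at fixed $\lambda$ and $\mu$ lowers both the waiting time and the fraction of time that all processors are busy.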

As another useful model for the Space Shuttle processing 
subsystem, assume now that we have m parallel processors as above, 
but that the arrival stream is generated by a finite set of sources. 
As in the previous section, we set the number of task sources to 
be equal to N. The terminal thinking time is taken to be an 
exponentially distributed random duration with mean $\lambda^{-1}$ [sec]. 
Required task processing times are exponentially distributed with 
mean $\mu^{-1}$ [sec/task]. 

Then, the probabilities $\{P_n\}$ of the number of tasks in the 
system, being processed or waiting in the queueing facility, are 
given by the following formulas. For 

$\rho = \lambda/(m\mu) < 1$ , 

we have 

$P_0 = \left[ \sum_{n=0}^{m-1} \frac{(m\rho)^n}{n!} \frac{N!}{(N-n)!} + \sum_{n=m}^{N} \frac{m^m \rho^n}{m!} \frac{N!}{(N-n)!} \right]^{-1}$ , (I.6.6-8) 

$P_n = P_0 \frac{(m\rho)^n}{n!} \frac{N!}{(N-n)!}$ for $n \le m$ , $P_n = P_0 \frac{m^m \rho^n}{m!} \frac{N!}{(N-n)!}$ for $m \le n \le N$ . (I.6.6-9) 


The index of utilization $U_m$ is given as 

$U_m$ = P{all m processors are busy} $= 1 - \sum_{n=0}^{m-1} P_n$ . (I.6.6-10) 


The average task waiting time W is computed using Eq. (I. 6. 6-7). 

We again note that the latter equations should be used to design 
the system such that acceptable system utilization and 
message response times are attained. 
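
A Python sketch of the finite-source, m-processor probabilities follows. The function name `finite_source_multiserver` is hypothetical, and the weights used are $(m\rho)^n N! / [n!(N-n)!]$ below m tasks and $m^m \rho^n N! / [m!(N-n)!]$ from m tasks upward, with $\rho = \lambda/(m\mu)$.

```python
from math import factorial


def finite_source_multiserver(N, lam, mu, m):
    """Queue-size probabilities and U_m for N sources and m processors."""
    rho = lam / (m * mu)

    def weight(n):
        perm = factorial(N) // factorial(N - n)      # N! / (N - n)!
        if n < m:
            return (m * rho) ** n / factorial(n) * perm
        return m ** m * rho ** n / factorial(m) * perm

    w = [weight(n) for n in range(N + 1)]
    P0 = 1.0 / sum(w)
    P = [P0 * x for x in w]
    U_m = 1.0 - sum(P[:m])                           # all m processors busy
    return P, U_m


print(finite_source_multiserver(N=2, lam=1.0, mu=1.0, m=2))
print(finite_source_multiserver(N=2, lam=1.0, mu=1.0, m=1))
```

With as many processors as sources (first call) no task ever queues, so the result reduces to a binomial distribution; with a single processor (second call) it reduces to the model of Section 1.6.5.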

1.6.7 Queueing Models Involving Input/Output and CPU Interactions 

In studying the performance of the Space Shuttle computer 
system, it is of particular importance to incorporate the interactions 
between the input/output and CPU queues. Proper queueing models for 
describing these interactions, and their performance characteristics 
and formulas, will be presented in this section. 

A basic simple model is that of a cyclic queue involving 
single CPU and I/O processors, shown in Fig. 1.6-6. 



Figure 1.6-6 


Tasks enter the system by joining the CPU queue, but only 
at the instants when tasks depart from the system. In that way 
the number of tasks in the system is kept constant at N. After 
receiving service by the CPU a task leaves the system with 
probability a, and then a new task immediately enters the system. 

With probability 1-a the task processed by the CPU enters the I/O 
queue. There, tasks are served on a first-come first-served 
basis. Upon departure from the I/O processor, a task 
immediately joins the CPU queue. 

We assume that each task is assigned processing times by the 
CPU and the I/O processor which are independent and exponentially 
distributed, with means $\mu_C^{-1}$ and $\mu_I^{-1}$, respectively. Thus, 

Avg. CPU service time = $\mu_C^{-1}$ [sec] ; (I.6.7-1) 

Avg. I/O service time = $\mu_I^{-1}$ [sec] . (I.6.7-2) 

Since a task will each time require I/O processing with probability 
1-a, and depart with probability a, we conclude that 

P{task uses CPU i times} = $(1-a)^{i-1} a$ , i = 1,2,... ; (I.6.7-3) 

Avg. number of times that a task uses CPU processing = $1/a$ ; (I.6.7-4) 

Avg. number of times that a task uses the I/O processor = $(1-a)/a$ . (I.6.7-5) 

Hence, we obtain 

Avg. total CPU processing time required by a task = $(a\mu_C)^{-1}$ [sec] ; (I.6.7-6) 

Avg. total I/O processing time required by a task = $(1-a)(a\mu_I)^{-1}$ [sec] . (I.6.7-7) 

Let 

$P_n$ = P{n tasks in the CPU, queued or being served} . (I.6.7-8) 

Then, we obtain 

$P_n = \frac{(1-\rho)\rho^n}{1-\rho^{N+1}}$ , n = 0,1,...,N , (I.6.7-9) 

where 

$\rho = \frac{\mu_I}{(1-a)\mu_C}$ . (I.6.7-10) 


The average time delay (response time) D of a task in the 
system is given by 

$D = \frac{N}{a\mu_C} \cdot \frac{1-\rho^{N+1}}{\rho-\rho^{N+1}}$ . (I.6.7-11) 








Note that the task response time D is equal to the sum of the task 
overall average waiting time in queues W and required overall average 
processing time. But, 

Avg. required task overall processing time = $\frac{1}{a\mu_C} + \frac{1-a}{a\mu_I}$ . (I.6.7-12) 

Therefore, the overall average task waiting time W is 

$W = D - \frac{1}{a\mu_C} - \frac{1-a}{a\mu_I}$ . (I.6.7-13) 

The CPU index of utilization is given by 

$U_C$ = P{CPU is occupied} $= 1 - P_0 = 1 - \frac{1-\rho}{1-\rho^{N+1}} = \frac{\rho-\rho^{N+1}}{1-\rho^{N+1}}$ . (I.6.7-14) 


By (I.6.7-11) and (I.6.7-14) we conclude that the task response time 
D and the CPU utilization index $U_C$ are related according to the 
formula 

$D = \frac{N}{a \mu_C U_C}$ . (I.6.7-15) 

Relation (I.6.7-15) shows clearly how the task response time is tied 
to the CPU utilization factor $U_C$: for a fixed number of tasks N, D is 
inversely proportional to $U_C$, while both D and $U_C$ increase as N is 
increased. The above 
formulas need to be used in designing the system so that proper 
response times and utilization values are attained. 
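
The cyclic-queue measures can be evaluated with the Python sketch below. The function name `cyclic_queue` is hypothetical. It assumes the traffic parameter $\rho = \mu_I / [(1-a)\mu_C]$ (I/O completions feed the CPU queue at rate $\mu_I$, and the CPU sends tasks to the I/O queue at rate $(1-a)\mu_C$) and obtains the response time from Little's result applied to the N circulating tasks, $D = N/(a\mu_C U_C)$.

```python
def cyclic_queue(N, a, mu_c, mu_i):
    """CPU utilization, response time and waiting time of the cyclic queue."""
    rho = mu_i / ((1.0 - a) * mu_c)
    if abs(rho - 1.0) < 1e-12:
        p = [1.0 / (N + 1)] * (N + 1)            # limiting case rho = 1
    else:
        norm = (1.0 - rho) / (1.0 - rho ** (N + 1))
        p = [norm * rho ** n for n in range(N + 1)]
    U_c = 1.0 - p[0]
    D = N / (a * mu_c * U_c)                      # tasks leave at rate a * mu_c * U_c
    service = 1.0 / (a * mu_c) + (1.0 - a) / (a * mu_i)
    return {"p": p, "U_C": U_c, "D": D, "W": D - service}


for N in (1, 2, 4, 8):
    r = cyclic_queue(N, a=0.5, mu_c=2.0, mu_i=3.0)
    print(N, round(r["U_C"], 4), round(r["D"], 4), round(r["W"], 4))
```

With a single task in the system (N = 1) no queueing can occur, and the computed waiting time is zero; this gives a quick consistency check on the formulas.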

More involved queueing models representing various interactions 
between a CPU (or several CPUs) and I/O devices can be developed 
using queueing network models. For the purpose of global performance 
prediction for the Space Shuttle computer network, the models presented 
in this section and the one presented in the next section are 
particularly useful. 

1.6.8 An Analytical Model for the Computer System Performance Prediction 

In studying the detailed behavior of the Space Shuttle computer 
system, one needs to model the interaction between the CPU and I/O 
queues. Such a simple cyclic-queue model, and its performance 
analysis, is presented in the previous section. This model can be 
used for a first-order study of the performance of the underlying 
computer system. 

In this section a more involved cyclic queueing model is 
described and studied. This queueing model also incorporates 
the proper interaction between the CPU and I/O queues. The 
level of detail here is such that it allows the system engineer 
to study changes in hardware configurations and gross changes in the 
software. 

The system model is shown in Figure 1.6-7. Users, or sources, 
request the processing of their associated jobs. Requests are 
first stored in queue 1, the queue for Main Storage, until space 
becomes available in main storage. After the job enters the main 
storage it actively competes for the use of the CPU or I/O devices. 
The job cycles between use of the CPU and I/O devices until it is 
completed. When a job is completed, a new job from the main 
storage queue replaces it. The source whose job is completed 
sends a new request for job processing after a random think time 
delay. 

The total number of jobs in the main storage and processing 
facility is limited to M. 

Jobs are assumed to relinquish the CPU to carry out an I/O 
operation. We need to statistically characterize the length 
of each CPU service period between successive I/O operations for 
those jobs in main storage. We assume here that either these 
periods are of fixed length or they are random and exponentially 
distributed. 

The I/O service time $S_I$ will also be taken to be either of fixed 
length or random and exponentially distributed. It is also assumed 
here that no I/O queueing occurs. The I/O devices are taken to be 
identical processors operating as M parallel servers. A job requiring 
I/O processing will then be directed immediately to a free I/O 
device. 

In studying the performance of this system, we assume that 
the system is sufficiently loaded so that there are always as 
many jobs requesting processing as the operating system will allow 
in main storage. Thus we take the number of jobs in the main 
storage system to be always M. Consequently, no more than N-M 
terminals will be in think mode at any one time. This can be regarded as 
a fixed multiprogramming level M. 

The model input parameters, reflecting the characteristics 
of the request traffic, required processing times, operating 
system and the hardware configuration, are summarized as follows. 

$T_C$ = Average total CPU time required by a job 
$\mu_C^{-1}$ = Average CPU time between I/O operations 
$\mu_I^{-1}$ = Average service time for an I/O request 
M = Number of jobs in the main processing system (level of 
multiprogramming) 
N = Number of terminals (sources, users) 
$\lambda^{-1}$ = Average user think-time between requests 

We also define 

$T_I$ = Average total I/O time required by a job 
K = Average number of times that a job requires I/O processing. 

We then clearly obtain that 

$T_I = K \mu_I^{-1} = T_C \mu_C \mu_I^{-1}$ . 


Note that the average time values include both the processing 

time of the job itself as well as the overhead time used by the 
system in running the job. 

In studying such a computer system, the performance measures 
of interest are the following ones. 

D = Job response time (sec) 
= Average time delay of a job in the system from entry of 
request to completion of processing. 

T = Computer system throughput (interactions/sec, or jobs 
served/sec) 
= Average number of jobs departing from the system, per unit 
time, after their processing is completed. 

U = CPU index of utilization 
= Average fraction of time that the CPU is utilized for 
processing 
= Probability that the CPU is occupied (busy, not idle). 

The job response time D is obtained, in terms of the CPU 
index of utilization U, to be given by the formula 

$D = \frac{N T_C}{U} - \lambda^{-1}$ [sec] . (I.6.8-3) 

In terms of U, the system throughput is given by 

$T = \frac{U}{T_C}$ [interactions/sec] . (I.6.8-4) 


To prove and explain relations (I.6.8-3)-(I.6.8-4) we note the 
following. Assume the system to run for a (long) period of $\tau$ sec. 
During this time assume that J jobs are processed, requiring a total 
CPU time of $\tau_1$ sec. Then 

U = processing time / elapsed time $= \tau_1/\tau$ , (I.6.8-5) 

T = number of jobs served / elapsed time $= J/\tau$ . 

Therefore, since $T_C = \tau_1/J$, 

$T = \frac{J}{\tau} = \frac{\tau_1/\tau}{\tau_1/J} = \frac{U}{T_C}$ , 

and eq. (I.6.8-4) results. 

Observe again the system for a period of $\tau$ sec, during which 
J jobs are processed. During this time each of the N terminals is 
either in a response-time period (waiting for its job to complete 
processing) or in a think-time period (experiencing delay prior to 
the initiation of the next request). We have, during $\tau$ sec, 

Average number of jobs completed per terminal = J/N . 

Hence, 

Average time taken by a single interaction = $\frac{\tau}{J/N} = \frac{N\tau}{J}$ . 

This time contains both system response time and user think-time, 
so that 

$D = \frac{N\tau}{J} - \lambda^{-1}$ . (I.6.8-11) 

We observed in (I.6.8-5) that 

$\frac{\tau}{J} = \frac{T_C}{U}$ . (I.6.8-12) 

Substituting (I.6.8-12) in (I.6.8-11), we derive (I.6.8-3). 

Equations (I.6.8-3)-(I.6.8-4) thus allow us to compute the 
message response time D and the system throughput T, once the 
CPU utilization index U is known. The latter is derived using the 
queueing techniques presented in previous sections. We obtain 
the following results. 

For constant CPU and I/O service times, the CPU utilization 



If we assume CPU and I/O service times to be exponentially 
distributed with means 


Thus, Eqs. (I.6.8-3)-(I.6.8-4) and (I.6.8-13)-(I.6.8-14) yield 
the system performance measures D, T and U. These are expressed here 
in a rather simple form in terms of the major processing system and 
message-traffic characteristics. 

In designing the system or modifying it to improve its performance 
or increase its capability or efficiency, one incorporates 
the given system parameters of interest and chooses the remaining 
ones to guarantee desired proper values for message delay D, CPU 
utilization U and system throughput T. 
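
As a sketch of how these relations might be used, the Python functions below compute the response time and throughput from a given CPU utilization U, using $D = N T_C/U - \lambda^{-1}$ and $T = U/T_C$. The helper `cpu_utilization_exponential` is an assumption: it applies the finite-source result of Section 1.6.5 with M circulating jobs, I/O completion rate $\mu_I$ and CPU service rate $\mu_C$, ignoring the effect of job replacement, and it is not claimed to reproduce the expressions of the study. All function and parameter names are hypothetical.

```python
from math import factorial


def cpu_utilization_exponential(M, mu_c, mu_i):
    """Assumed U for exponential CPU and I/O times with M jobs in main storage."""
    r = mu_i / mu_c
    s = sum(factorial(M) // factorial(M - i) * r ** i for i in range(M + 1))
    return 1.0 - 1.0 / s


def response_time(N, T_c, U, think_time):
    """D = N * T_c / U - think_time  [sec]."""
    return N * T_c / U - think_time


def throughput(U, T_c):
    """T = U / T_c  [interactions/sec]."""
    return U / T_c


U = cpu_utilization_exponential(M=3, mu_c=50.0, mu_i=20.0)
D = response_time(N=20, T_c=0.2, U=U, think_time=3.0)
T = throughput(U, T_c=0.2)
print(round(U, 4), round(D, 4), round(T, 4))
```

A useful check on any pair (D, T) produced this way is the identity $N = T (D + \lambda^{-1})$: every terminal is always either waiting for a response or thinking.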



The computer model has been explained in Section 1.2 (see 
Fig. 1.2.1). We have combined here I/O and memory operations and 
accessing functions as single-unit I/O operations. The computer 
subsystem parameters are described in Section 1.2.2. 

The communication subnetwork structure has been outlined in 
Section I.1. The relevant parameters are presented in Section 1.2.5. 
This network is composed of a set of half-duplex data buses properly 
shared by the computers. The buses are made available for information 
transmission or reception to the terminals at certain times. 

The third subsystem is the user system. Terminals (users, tasks) 
are granted access to the communication network and the GPCs at 
certain times in accordance with their requests for service, 
TDM and polling processing, and GPC initiated actions. 

The relevant system performance measures have been presented 
in Section 1.3. 

The heart of the system is the computer complex. We thus 
start by presenting queueing models for the computer system. 

1.7.2 A Time Frame Model for the Computer System 

We need to differentiate between cyclic and acyclic tasks. 
Such tasks have been statistically characterized in Sections 1.2.3-1.2.4. 
Tasks for which computer time is reserved should also be 
described. Within the operational time period under consideration, 
we can thus make the following period definitions. We set: 

$T_F$ = duration of main time cycle (time frame) [sec] 

$T_D$ = duration of the time cycle period which is dedicated 
(reserved) to certain tasks (on a non-contention basis) [sec] 

$T_C$ = duration of the time cycle period which is used by 
cyclic tasks [sec] 

$T_A$ = duration of the time cycle period which is used by acyclic 
tasks [sec] 

Thus, we have identified a time cycle (frame) of duration $T_F$. 
This frame is divided into the following three periods: 

• The dedicated frame period, of duration $T_D$. This time period 
is reserved for certain tasks (application processes). These 
tasks can be cyclic or acyclic, scheduled or non-scheduled. 
During the period under consideration, the network controller 
assigns this periodic portion of the time frame, on a contention-free 
basis, to these tasks. Included are: scheduled tasks, 
routine updating tasks, routine information flow and processing 
duties, high priority dedicated services, etc. 

• The cyclic frame period, of duration $T_C$. This period of time 
is periodically reserved for serving cyclic tasks. Service 
time portions within a cyclic frame period are assigned by 
the network controller (GPC) according to service demands 
(scheduled and unscheduled). These assignments are governed 
by the system priority service rules. A cyclic task which is 
assigned service time within a certain cyclic frame period 
keeps the same assignment in succeeding cyclic frame periods, 
until its processing is completed, or until its service is 
pre-empted by the network controller. 

• The acyclic frame period, of duration $T_A$. This period is used 
by acyclic tasks which arrive at random and require service time. 
Time is assigned in accordance with the system priority service 
rules. 

We clearly have (Fig. 1.7.2) 

$T_F = T_D + T_C + T_A$ . (I.7.2-1) 

In the operational period under consideration, tasks with 
dedicated service times (which total $T_D$ sec) do not experience 
any waiting time. We can thus write 

$W_D(k) = 0$ , (I.7.2-2) 

$D_D(k) = S(k)$ , (I.7.2-3) 

where 

$W_D(k)$ = waiting-time of a class k task with dedicated service 

$D_D(k)$ = time delay of a class k task with dedicated service 

S(k) = overall GPC service time required by a class k message 

Of course, the length of the period $T_D$ assigned for 
dedicated service will affect the overall index of utilization of 
the computer system, as will be noted in the following analysis. 

To obtain the delay-throughput performance characteristics 
of the computer system, we thus need to study the service of 
cyclic and acyclic tasks. This is carried out in the following 
sections. 

Figure 1.7.2. 


1.7.3 Queueing Analysis for Cyclic Tasks: Model I 

We consider, in these sections, cyclic tasks which are served 
during the cyclic frame periods. Each cyclic frame period is of 
length $T_C$ sec. Any two consecutive cyclic frame periods are 
separated by a time period of duration 

$T_A + T_D = T_F - T_C$ . (I.7.3-1) 

Assume that the computer system can serve $N_C$ cyclic tasks 
during each cyclic frame period. For simplicity of presentation, 
we also assume that equal service times are assigned to all served 
cyclic tasks, during each cyclic period. Therefore, a served 
cyclic task is granted a fixed service time of duration $\Delta$ sec, 
where 

$\Delta = T_C / N_C$ . (I.7.3-2) 

Assume cyclic tasks to arrive at the system according to 
a Poisson process with intensity 

Arrival intensity = ,\^ [cyclic tasks/sec] . (1. 7. 3-3) 

Each cyclic task is assumed to require a service time 
which is exponentially distributed with a mean (required computer) 
service time equal to 

E(Sc) = [sec/cyclic task] , (I. 7. 3-4) 

When any one of the N_C time slots during a cyclic period 
becomes available, upon the termination of service of a cyclic task, 
it can be assigned to any one of the queued cyclic tasks waiting 
for service. The queueing system under consideration thus becomes 
an N_C-server queueing system. However, it is not a regular 
multi-server queueing system, since it experiences interruptions 
in service. After a service period of T_C sec, the service granted 
to cyclic tasks is interrupted for a period of T_F - T_C sec. Subsequently, 
service is resumed (simultaneously given to N_C cyclic tasks) for 
another period of T_C sec, and then interrupted again, and so on. 

A simple approximate technique for the performance 
analysis of this interrupted multi-server queueing model is developed 
here and described in the following. We consider an equivalent 
non-interruptable queueing system with N_C servers and the following 
parameters. To incorporate the original interruption times, we 
let the equivalent service demand be exponentially distributed 
with mean service time 

E(S') = \mu_C^{-1} T_F / T_C .    (I.7.3-5) 

The arrival intensity remains equal to \lambda_C [cyclic tasks/sec]. 

Considering this equivalent queueing model, we perform the 
associated queueing analysis and obtain the following results 
(in accordance with the formulas presented in Section 1.6.6). 

By this model, we assume that each served task is processed 
by the computer system for a period of \Delta sec during each T_C sec 
cycle. A number N_C of cyclic tasks are served simultaneously. 

Each task will thus require an average of 

Avg. No. cyclic periods used by a cyclic task 
    = \mu_C^{-1} \Delta^{-1} = (\mu_C \Delta)^{-1} .    (I.7.3-6) 


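The queueing analysis follows the procedure described in 
Section 1.6.6, when (I.7.3-5) is incorporated. The following results 
are subsequently obtained. Let 

P_n = P{n cyclic tasks in the system, queueing 
      or being served} .    (I.7.3-7) 

Define the traffic intensity parameter \rho by 

\rho = \lambda_C E(S') / N_C = \lambda_C T_F / (N_C \mu_C T_C) .    (I.7.3-8) 

For the system queueing process to be stable, so that queue sizes 
and task response times would not become arbitrarily high, we must 
require 

\rho < 1 .    (I.7.3-9) 

Henceforth, relation (I.7.3-9) is assumed to hold. Then, 

P_0 = [ \sum_{i=0}^{N_C-1} (N_C \rho)^i / i! + (N_C \rho)^{N_C} / (N_C! (1-\rho)) ]^{-1} ,    (I.7.3-10) 

P_n = (N_C \rho)^n / n! \cdot P_0 ,          if n \le N_C , 
P_n = N_C^{N_C} \rho^n / N_C! \cdot P_0 ,    if n > N_C .    (I.7.3-11) 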


The GPC index of utilization (see definition by Eq. (I.3.1-T)) 
is therefore given by 

U_C = P{a GPC is busy in processing a cyclic task during 
      the cyclic period} 
    = 1 - P_0 ,    (I.7.3-12) 

where P_0 is given by (I.7.3-10). 

The GPC throughput in processing cyclic tasks (see definitions 
(I.3.1-17)-(I.3.1-18)), assuming none to be rejected, is given by 

TH_C = the GPC cyclic task throughput 
     = average number of cyclic tasks processed by the GPC per sec 
     = \lambda_C cyclic tasks/sec .    (I.7.3-13) 


To obtain the throughput in bps, we set 

C = GPC average service rate in bps .    (I.7.3-14) 

Then, 

THp(C) = GPC throughput in bps for cyclic tasks 

= bps    (I.7.3-15) 


The average task waiting time for a cyclic task is equal to 

W_C = Average Cyclic Task Waiting-Time 

    = \lambda_C^{-1} \sum_{n=N_C}^{\infty} (n - N_C) P_n ,    (I.7.3-16) 

where P_n is given by (I.7.3-10)-(I.7.3-11). 

The average cyclic task time delay (response time), D_C, is thus 
given by 

D_C = average cyclic task response time 

    = W_C + E(S') = \lambda_C^{-1} \sum_{n=0}^{\infty} n P_n .    (I.7.3-17) 

The variance and distribution of the task response time are 
obtained similarly. 
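
The Model I computation can be sketched numerically as follows. This is a minimal illustration, not the report's own program: it assumes the equivalent system is a standard M/M/N_C queue with the stretched mean service time E(S') = \mu_C^{-1} T_F / T_C, and it uses the textbook Erlang-C waiting-time expression. The function name mmc_interrupted and its argument names are hypothetical.

```python
from math import factorial


def mmc_interrupted(lam_c, mu_c, n_c, t_f, t_c):
    """Approximate Model I measures for cyclic tasks.

    Assumption: an M/M/n_c queue whose mean service time is stretched
    by t_f / t_c to account for the service interruptions.
    """
    es_eff = (1.0 / mu_c) * (t_f / t_c)   # equivalent mean service time E(S')
    a = lam_c * es_eff                    # offered load, equal to n_c * rho
    rho = a / n_c
    if rho >= 1.0:
        raise ValueError("unstable system: rho >= 1")
    tail = a ** n_c / (factorial(n_c) * (1.0 - rho))
    p0 = 1.0 / (sum(a ** i / factorial(i) for i in range(n_c)) + tail)
    p_wait = tail * p0                    # Erlang-C probability of waiting
    w = p_wait * es_eff / (n_c * (1.0 - rho))   # mean waiting time
    d = w + es_eff                              # mean response time
    return {"rho": rho, "P0": p0, "W": w, "D": d}


if __name__ == "__main__":
    print(mmc_interrupted(lam_c=1.0, mu_c=4.0, n_c=1, t_f=2.0, t_c=1.0))
```

With a single server the result reduces to the familiar M/M/1 values, which gives a quick sanity check on the parameters chosen for a mission phase.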

If a finite source model is desired, the proper formulas follow 
by Eqs. (I.6.6-8)-(I.6.6-10). 

The study of buffer overflow characteristics is illustrated 
by the following model. We let (see also Section 1.2.4) 

M_C = overall (average) storage capacity for cyclic tasks .    (I.7.3-18) 

Assume that M_C > N_C. Thus, assume that no more than an overall 
number of M_C cyclic tasks can be stored in the system. Then, using 
the queueing models and methods of Section 1.6.6 we obtain the 
queue-size probabilities: 

P_0 = [ \sum_{k=0}^{N_C-1} (N_C \rho)^k / k! + (N_C^{N_C} / N_C!) \sum_{k=N_C}^{M_C} \rho^k ]^{-1} ,    (I.7.3-19) 

P_n = (N_C \rho)^n / n! \cdot P_0 ,          if n < N_C , 
P_n = N_C^{N_C} \rho^n / N_C! \cdot P_0 ,    if N_C \le n \le M_C , 
P_n = 0 ,                                    if n > M_C .    (I.7.3-20) 

The probability of overflow is subsequently given by 

P_{M_C} = N_C^{N_C} \rho^{M_C} / N_C! \cdot P_0 ,    (I.7.3-21) 

where P_0 is given by (I.7.3-19) and \rho is given by (I.7.3-8). For this 
system, with a limited storage capacity, it is no longer necessary 
to require \rho < 1. 

The GPC index of utilization and response time are given again 
according to formulas (I.7.3-12) and (I.7.3-17), with P_0 now expressed 
by Eq. (I.7.3-19). 

Substituting the proper system and task-traffic parameters, as 
well as the parameters characterizing the mission phase under 
consideration, the above formulas allow us to compute the system 
performance indices related to the service of cyclic tasks. 
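
The finite-storage case can be sketched in the same way. The code below is only an illustration under the assumption that the limited-capacity system is the standard M/M/N_C queue truncated at M_C tasks; overflow_probability and its argument names are hypothetical.

```python
from math import factorial


def overflow_probability(lam_c, mu_c, n_c, m_c, t_f, t_c):
    """Probability that all m_c storage places for cyclic tasks are full.

    Assumption: truncated M/M/n_c queue with the stretched mean service
    time (1/mu_c) * t_f / t_c; rho may exceed 1 in this finite system.
    """
    a = lam_c * (1.0 / mu_c) * (t_f / t_c)   # offered load n_c * rho
    rho = a / n_c

    def weight(n):
        # Unnormalized state probability P_n / P_0.
        if n < n_c:
            return a ** n / factorial(n)
        return n_c ** n_c * rho ** n / factorial(n_c)

    weights = [weight(n) for n in range(m_c + 1)]
    p0 = 1.0 / sum(weights)
    return weights[m_c] * p0


if __name__ == "__main__":
    print(overflow_probability(1.0, 1.0, n_c=1, m_c=2, t_f=1.0, t_c=1.0))
```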
1.7.4 Queueing Analysis for Cyclic Tasks: Model II 

To arrive at a more detailed GPC queueing model, in describing 
the service of cyclic tasks, we can use the models described in 
Sections 1.6.7 and 1.6.8. Using these models, we can describe the 
CPU/IO processing interactions in the GPC system. 

Assume thus the GPC service system to be described by the 
closed loop model illustrated by Fig. 1.6.6. 

We assume that the number of cyclic tasks in the GPC is kept 
constant, equal to N_C, as in the previous section. 

A task entering the GPC system joins the CPU queue. A task can 
enter GPC service only when a previous one has been completed, 
assuming thus a constant number of N_C cyclic tasks in the system. 

After receiving service by the CPU a cyclic task leaves the system 
with probability \alpha_C. With probability 1 - \alpha_C this task will subsequently 
enter the I/O queue. There, tasks are served on a first-come first-served 
basis. Upon departure from the I/O processor, a task immediately joins 
the CPU queue. 


We assume each cyclic task to require CPU and I/O processing 
times which are i.i.d. exponentially distributed random variables 
with means 

Avg. CPU service time required by a cyclic task 
    = \mu_0^{-1}(C) [sec] ;    (I.7.4-1) 

Avg. I/O service time required by a cyclic task 
    = \mu_1^{-1}(C) [sec] .    (I.7.4-2) 

We obtain that 

Avg. number of times that a cyclic task uses the 
I/O processor = (1 - \alpha_C) / \alpha_C .    (I.7.4-3) 

Also, 

Avg. total CPU processing time required by a cyclic task 
    = [\alpha_C \mu_0(C)]^{-1} [sec] ;    (I.7.4-4) 

Avg. total I/O processing time required by a cyclic task 
    = (1 - \alpha_C) [\alpha_C \mu_1(C)]^{-1} [sec] .    (I.7.4-5) 

We define the queue size probabilities 

P_n = P{n cyclic tasks in the CPU, queued or being served} .    (I.7.4-6) 

To perform the queueing analysis we note again that the service 
of a cyclic task by the GPC system proceeds in an interrupted periodic 
manner. We perform an approximate queueing analysis by setting the 
effective mean CPU and I/O processing times, denoted as \mu_0'^{-1}(C) and 
\mu_1'^{-1}(C), respectively, to be 

\mu_0'^{-1}(C) = \mu_0^{-1}(C) T_F / T_C ,    (I.7.4-7) 

\mu_1'^{-1}(C) = \mu_1^{-1}(C) T_F / T_C ,    (I.7.4-8) 

following the same approximation adopted in the previous section. The 
following analysis results are obtained (see also Section 1.6.7). 
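
The per-task CPU and I/O demands of Model II follow from the geometric number of CPU visits (a task leaves after a CPU burst with a fixed probability). The sketch below assumes exactly that reading; model2_demands and its argument names (alpha for the exit probability, mu0 and mu1 for the CPU and I/O service rates) are hypothetical labels.

```python
def model2_demands(mu0, mu1, alpha, t_f, t_c):
    """Mean per-task demands in the closed CPU-I/O loop.

    Assumptions: the number of CPU visits is geometric with mean 1/alpha,
    and effective service times are stretched by t_f / t_c.
    """
    stretch = t_f / t_c
    return {
        "io_visits": (1.0 - alpha) / alpha,          # mean I/O uses per task
        "cpu_total": 1.0 / (alpha * mu0),            # total CPU time per task
        "io_total": (1.0 - alpha) / (alpha * mu1),   # total I/O time per task
        "cpu_burst_eff": stretch / mu0,              # effective mean CPU burst
        "io_burst_eff": stretch / mu1,               # effective mean I/O service
    }


if __name__ == "__main__":
    print(model2_demands(mu0=10.0, mu1=5.0, alpha=0.25, t_f=2.0, t_c=1.0))
```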

The system traffic intensity parameter p is set equal to 

l>l(C) 

We obtain the queue-size probability to be given by 

^ ^ " 0,1,..., Nj. . (1. 7.' 

The average response time (time delay) of a cyclic task in 
the system is obtained to be expressed as 

The average task waiting time Wq is 


W = D 




c 


The CPU index of utilization is equal to 

U_CPU(C) = CPU index of utilization by cyclic tasks 

         = P{CPU is occupied by cyclic tasks during the 
           cyclic period} 

We note that the cyclic task response time and the CPU 
index of utilization U_CPU(C) are related by the formula 














Thus, we can use these formulas to compute, for each mission 
phase, the relevant performance indices. 

1.7.5 Queueing Analysis for Cyclic Tasks: Model III 

Model III for the GPC service system is chosen to be the model 
described in section 1.6.8 (see Fig. 1.6.7). See Section 1.6.8 for 
detailed description and derivations. All the system parameters 
used are denoted as in this section, with the following modifications. 


We consider here only cyclic tasks. Subsequently, parameters 
are denoted as: T_0(C), \mu_0^{-1}(C), \mu_1^{-1}(C), \lambda^{-1}, T_1(C), K(C). 

We set M = N_C to denote the maximal number of cyclic tasks in 
the main processing system. 

We set N = N(C) to denote the number of cyclic terminals (sources, 
users). The average terminal thinking time is \lambda^{-1}. 

Rederiving the delay-throughput expressions for the present case, 
we obtain the cyclic task response time D_C to be given as a function 
of the CPU index of utilization by cyclic tasks during the cyclic 
period, U_CPU(C), 


NcTc(C)Tf ^ 1 

“u^^ljTcTTc c 


[sec] 


The throughput is given by 




[interactions/sec] 


For exponentially distributed CPU and I/O service times, we 
obtain the CPU index of utilization to be equal to 








Using these formulas, one computes the system indices of 
performance when considering the service of cyclic tasks, under 
various mission conditions. 

Using Eq. (I.7.5-3), one computes the CPU index of utilization 
U_CPU(C). The latter indicates the fraction of the cyclic frame 
period that is occupied by the service of cyclic tasks. The 
parameters involved in this computation are: 

N_C = maximal number of cyclic tasks served during a single 
      cyclic frame period 

\mu_0^{-1}(C) = average CPU time for cyclic tasks between I/O 
      operations, during the cyclic period 

\mu_1^{-1}(C) = average service time for an I/O cyclic request, 
      during the cyclic period 

The computer system throughput for cyclic tasks, TH_C, is evaluated 
by using Eq. (I.7.5-2). It yields the average number of cyclic 
task completions per unit time within the cyclic frame periods. 

Finally, the response time (average time delay) of a cyclic task, 
D_C, is computed by using Eq. (I.7.5-1). It yields the average 
time delay of a cyclic task, from the instant it indicates its task 
request to the instant its service is completed. 

The additional parameters involved in Eqs. (I.7.5-1)-(I.7.5-2) 
are: 

T_0(C) = average total CPU time required by a cyclic task 

\lambda^{-1} = average think time between initiation of a new cyclic 
      task and the completion of the previous one 

T_C = duration of a cyclic frame period 

T_F = duration of a frame period (main system cycle) 

1.7.6 Queueing Analysis for Acyclic Tasks: Priority Model I 

We consider now the service of acyclic tasks. Requests for 
service by such tasks arrive at random, according to the 
statistics of a Poisson process with intensity \lambda_A acyclic tasks/sec. 
Thus: 

Average number of new acyclic task (request for service) 
arrivals = \lambda_A acyclic tasks/sec .    (I.7.6-1) 

The computer system can serve acyclic tasks only during the 
acyclic frame periods (see Section 1.7.2). Therefore, acyclic tasks 
are served by the GPC during their period for a length of time of 
T_A sec; then, service is interrupted for T_F - T_A sec; subsequently, 
service resumes for another T_A sec, and so on. 

We wish to describe here a simple queueing model for the GPC 
service system, which incorporates different task priorities (see 
Section 1.5). We consider a generalization of the priority queueing 
model described and analyzed in Section 1.5.4. 

Acyclic tasks are classified into p priority classes. A 
class-k task is a task with priority number k, k = 1,2,...,p. 
Class-1 tasks attain the highest priority, while class-p tasks 
have the lowest priority. 

Under a nonpreemptive service discipline, when computer service 
time becomes available, a class-i task is served before any class-j 
task if i < j. Within each class, tasks are served in order of 
arrival. No preemption (interruption) of any task service is 
allowed. 

Under a preemptive resume service discipline, class-i tasks 
are again preferred over class-j tasks if i < j, as above. However, 
now we allow the preemption (interruption of service) of a lower 
priority task when a higher priority task arrives at the system. 

We assume that class-k acyclic tasks arrive at the system 
according to a Poisson process with intensity \lambda_A(k) tasks/sec: 

Intensity of arrival of priority-k acyclic tasks 
    = \lambda_A(k) tasks/sec, k = 1,2,...,p ;    (I.7.6-2) 

so that 

\lambda_A = \sum_{k=1}^{p} \lambda_A(k) .    (I.7.6-3) 

We set 

S_A(k) = GPC processing time required by a class k acyclic task. 

The corresponding required service time moments are 

\bar{S}_A(k) = E{S_A(k)} = mean service time for priority-k task ;    (I.7.6-4) 

\overline{S_A^2}(k) = E{S_A^2(k)} .    (I.7.6-5) 

In particular, we note that if the GPC service time required by an 
acyclic class-k task is exponentially distributed with mean \mu_A^{-1}(k), then 

\bar{S}_A(k) = \mu_A^{-1}(k) ,   \overline{S_A^2}(k) = 2 \mu_A^{-2}(k) .    (I.7.6-6) 

On the other hand, if each class-k task has a fixed service 
requirement, S_A(k) = \mu_A^{-1}(k), then 

\bar{S}_A(k) = \mu_A^{-1}(k) ,   \overline{S_A^2}(k) = \mu_A^{-2}(k) .    (I.7.6-7) 


We set 

\rho_i = \lambda_A(i) \bar{S}_A(i) T_F / T_A ,    (I.7.6-8) 

\sigma_k = \sum_{i=1}^{k} \rho_i ,    (I.7.6-9) 

\rho = \sigma_p = \sum_{i=1}^{p} \rho_i .    (I.7.6-10) 

For queue-size stability (so that queue sizes and message delays 
would not become arbitrarily high) we require 

\rho < 1 .    (I.7.6-11) 


We set 

W_A(k) = average waiting time for a priority-k acyclic task 

D_A(k) = average time delay (response time) for a priority-k acyclic task 

X_A(k) = average queue size of priority-k acyclic tasks 

X_A = average queue size of all acyclic tasks. 


Assume first a nonpreemptive service discipline. The same 
approximation for describing the service interruption used in 
previous sections is employed. We obtain the following formulas, 

W_A(k) = W_0 / [ (1 - \sigma_{k-1}) (1 - \sigma_k) ] ,   k = 1,2,...,p ,    (I.7.6-12) 

where \sigma_0 = 0 and 

W_0 = (1/2) \sum_{i=1}^{p} \lambda_A(i) \overline{S_A^2}(i) (T_F / T_A)^2 ;    (I.7.6-13) 

D_A(k) = W_A(k) + \bar{S}_A(k) .    (I.7.6-14) 


The system index of utilization is: 

U_A = index of utilization of GPC by acyclic tasks 

    = P{GPC is occupied by acyclic tasks during the 
      acyclic frame periods} . 

It is given by 

U_A = \sum_{i=1}^{p} \rho_i = \rho .    (I.7.6-15) 


If a preemptive-resume service discipline is assumed, we obtain 
the following formulas. 

D_A(k) = B_k / [ 2 (1 - \sigma_{k-1}) (1 - \sigma_k) ] ,    (I.7.6-16) 

B_k = 2 (1 - \sigma_k) \bar{S}_A(k) (T_F / T_A) + \sum_{i=1}^{k} \lambda_A(i) \overline{S_A^2}(i) (T_F / T_A)^2 ;    (I.7.6-17) 

X_A(k) = \lambda_A(k) D_A(k) ,    (I.7.6-18) 

X_A = \sum_{k=1}^{p} X_A(k) .    (I.7.6-19) 

Thus, under a preemptive resume priority service discipline 
the message response time is given by Eqs. (I.7.6-16)-(I.7.6-17), 
while the queue sizes are given by Eqs. (I.7.6-18)-(I.7.6-19). We 
note that the required average buffer sizes are estimated by the 
queue-size values of (I.7.6-18)-(I.7.6-19). The computer index 
of utilization is expressed again by Eq. (I.7.6-15). 

These formulas, and their extensions, as outlined in this 
report, allow us to analyze the computer system performance 
under the proper mission conditions. 
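
As a numerical illustration of the priority model, the sketch below evaluates per-class response times with the textbook M/G/1 priority formulas (nonpreemptive and preemptive resume), after stretching every service time by T_F/T_A. This is an assumption about how the interruption approximation enters, not a transcription of the report's equations; priority_delays and its argument names are hypothetical.

```python
def priority_delays(lam, s1, s2, t_f, t_a, preemptive=False):
    """Mean response time per priority class (index 0 = highest priority).

    lam[k]: arrival rate of class k; s1[k], s2[k]: first and second
    moments of its required service time. Assumption: service times are
    scaled by t_f / t_a to model the interrupted acyclic service.
    """
    f = t_f / t_a
    m1 = [s * f for s in s1]          # effective mean service times
    m2 = [s * f * f for s in s2]      # effective second moments
    rho = [l * m for l, m in zip(lam, m1)]
    if sum(rho) >= 1.0:
        raise ValueError("unstable system: total rho >= 1")
    delays = []
    sigma_prev = 0.0
    for k in range(len(lam)):
        sigma_k = sigma_prev + rho[k]
        denom = (1.0 - sigma_prev) * (1.0 - sigma_k)
        if preemptive:
            # Only classes of equal or higher priority delay class k.
            resid = 0.5 * sum(lam[i] * m2[i] for i in range(k + 1))
            d = (m1[k] * (1.0 - sigma_k) + resid) / denom
        else:
            # Residual service of any class in progress delays class k.
            w0 = 0.5 * sum(l * m for l, m in zip(lam, m2))
            d = w0 / denom + m1[k]
        delays.append(d)
        sigma_prev = sigma_k
    return delays


if __name__ == "__main__":
    print(priority_delays([0.2, 0.2], [1.0, 1.0], [2.0, 2.0], 1.0, 1.0))
```

Multiplying each returned delay by the class arrival rate gives the corresponding average queue size, in line with Eq. (I.7.6-18).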

We have demonstrated here the use of a simple priority queueing 
model. Other priority queueing models have been presented and 
analyzed in Sections 1.4 and 1.5. A multitude of time-sharing 
queueing models are presented in Section 1.4. Various priority 
queueing models are discussed and investigated in Section 1.5. The 
results presented there are directly applicable to the queueing modeling 
and analysis of the Space Shuttle avionics computer system studied 
here. The only modification necessary, when considering acyclic 
tasks, is the incorporation of an effective required service time 
equal to S_A(k) T_F / T_A. 

In this way, the proper traffic, task and subsystem models 
and parameters, presented in Section 1.2, are used to evaluate 
the computer system performance measures presented in Section 1.3. 
The results of Sections 1.4-1.5 are properly integrated. 

1.7.7 Queueing Analysis for Acyclic Tasks: Models II 

Queueing models describing the service of acyclic tasks, 
while detailing the CPU/IO interactions, are developed and studied 
in a manner which is completely analogous to those presented in 
Sections 1.7.4-1.7.5. The only differences lie in: 

• Choosing system service and arrival parameters for acyclic 
tasks, rather than cyclic tasks; 

• Choosing the proper number N_A (rather than N_C) for the maximal 
number of tasks allowed simultaneously to be in GPC 
service; 

• Replacing T_F/T_C by T_F/T_A. 

Otherwise, we obtain the same relationships for the computer 
system indices of performance. 

1.7.8 Joint Queueing Analysis 

The results in Sections 1.7.3-1.7.7 are combined as follows to 
yield the indices of performance for the global computer system. 

The response times (average message delays) of a cyclic and 
an acyclic task are given by D_C and D_A, respectively. If priority-k 
tasks are considered, the corresponding response times are D_C(k) and 
D_A(k). The proper formulas are given in Sections 1.7.3-1.7.7. The 
time-frame division between dedicated, cyclic and acyclic periods 
has been described in Section 1.7.2. 

The traffic intensities of dedicated, cyclic and acyclic tasks 
are denoted as \lambda_D, \lambda_C and \lambda_A, respectively. Then, if we choose a 
certain task at random, its average queueing delay (response time) 
D will be equal to 

D = \lambda^{-1} { \lambda_D D_D + \lambda_C D_C + \lambda_A D_A } ,    (I.7.8-1) 

where 

\lambda = \lambda_D + \lambda_C + \lambda_A .    (I.7.8-2) 

The function D_D denotes the average delay of a dedicated task. 
For such a task we have presently reserved computer time. We 
can thus assume its waiting time to be equal to 0, and set its 
response time equal to its average required service duration. We 
set 

\bar{S}_D = average required computer service time for a dedicated 
      task. 

Subsequently, the dedicated task response time is equal to 

D_D = \bar{S}_D T_F / T_D .    (I.7.8-3) 

In computing the computer system queue sizes, we write 

X = X_D + X_C + X_A ,    (I.7.8-4) 

where 

X = global system average queue size; 

X_D = average queue size of dedicated tasks, 

X_A = average queue size of acyclic tasks, 

X_C = average queue size of cyclic tasks. 

If we assume that presently no dedicated tasks are waiting, as noted 
above, then the queue size X_D is equal to the number of dedicated 
tasks presently being in service. 


The GPC index of utilization U is computed as follows. We 
set 

U_C = GPC index of utilization for cyclic tasks in the cyclic 
      periods, 

U_A = GPC index of utilization for acyclic tasks in the acyclic 
      periods, 

U_D = GPC index of utilization for dedicated tasks. 

The indices U_C and U_A have been computed in Sections 1.7.3-1.7.7. 
The index U_D is set to be 

U_D = fraction of time that the dedicated frame period 
      (of length T_D) is used.    (I.7.8-5) 

The function U_D is determined by the state of the mission as pertaining 
to how much dedicated service is presently required. 

The GPC index of utilization U, in serving all these three 
classes of tasks, is defined through 

1 - U = fraction of time the GPC is idle 
      = P{GPC not occupied in serving any dedicated, or cyclic, 
        or acyclic task} .    (I.7.8-6) 

We conclude that 

U = 1 - (1 - U_D)(1 - U_C)(1 - U_A) .    (I.7.8-7) 

Using the index of utilization formulas presented in previous sections, 
we can determine the time frame values T_D, T_C, T_A that will yield 
the desired high (and even maximum) system utilization values, 
under proper task response-time and queue-size constraints. The 
system designer and analyst can thus deduce, adjust and plan the 
proper compromise system performance values. 
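
The combination rules of Eqs. (I.7.8-1), (I.7.8-2) and (I.7.8-7) are simple enough to sketch directly. The function names and the dictionary keys "D", "C", "A" (dedicated, cyclic, acyclic) below are hypothetical; the per-class delays and utilizations are assumed to come from the earlier sections.

```python
def joint_delay(lam, delay):
    """Traffic-weighted average response time over the task classes."""
    total = sum(lam.values())
    return sum(lam[c] * delay[c] for c in lam) / total


def joint_utilization(util):
    """Overall index of utilization: one minus the product of idle fractions."""
    idle = 1.0
    for u in util.values():
        idle *= (1.0 - u)
    return 1.0 - idle


if __name__ == "__main__":
    lam = {"D": 1.0, "C": 2.0, "A": 1.0}       # tasks/sec, illustrative values
    delay = {"D": 0.1, "C": 0.2, "A": 0.5}     # sec, illustrative values
    util = {"D": 0.5, "C": 0.5, "A": 0.5}
    print(joint_delay(lam, delay), joint_utilization(util))
```

Sweeping candidate frame divisions through the per-class models and then through these two functions is one way to explore the utilization and delay trade-off described above.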

1.7.9 Queueing Analysis for User Terminals: Output Traffic 

The queue-size behavior of a user terminal is described by 
the following model. 

We describe the process of transmission of requests or 
messages from a user terminal to the computer complex by a cyclic 
polling TDM (time-division multiplexing) procedure. For that purpose 
we divide messages into fixed-length data units called packets. 
A packet will contain an average of \mu^{-1} bits: 

Average packet length = \mu^{-1} bits .    (I.7.9-1) 

A packet can contain request-for-service information or any data 
information transmitted to the computer system. 

Data is transmitted across the data-bus network at a rate 
of C bps: 

Transmission rate = C bps .    (I.7.9-2) 

For the avionics network, we have 

C = 10^6 bps. 

Subsequently, the packet transmission time is equal to \tau sec, where 

\tau = (\mu C)^{-1} [sec] .    (I.7.9-3) 

Assume now that we establish a basic time slot of duration \tau sec, 
so that the terminal under consideration is polled as follows. It 
is assigned, on a fixed TDM basis, a single slot for information 
transmission, once every M slots. Thus, the terminal can transmit 
a packet of information in its assigned slot of \tau sec duration; 
subsequently, it has to wait (M-1)\tau sec for its next assigned 
slot to occur, and so on (see Fig. 1.7.2). 

Figure 1.7.2 [slot timing: the assigned slot, then M-1 slots, repeating] 



Assume now that the terminal generates packets (of service 
requests or applications data) according to a Poisson process with 
intensity \lambda_P packets/sec. Thus 

Packet intensity at a terminal = \lambda_P packets/sec .    (I.7.9-4) 

Let the terminal indices of performance be given by: 

D_P = average delay (response time) for a packet at a user 
      terminal [sec],    (I.7.9-5) 

X_P = average queue size (in packets) at user buffer,    (I.7.9-6) 

U_P = index of utilization of a user terminal buffer.    (I.7.9-7) 

We proceed with a TDMA queueing theoretical analysis and obtain 
the following results for the terminal performance functions. 

D_P = \tau [ M/2 + 1 + M \rho / (2 (1 - \rho)) ] ,    (I.7.9-8) 

where 

\rho = M \lambda_P \tau < 1 ;    (I.7.9-9) 

X_P = \lambda_P D_P ,    (I.7.9-10) 

U_P = \rho = M \lambda_P \tau .    (I.7.9-11) 

Thus, in observing the queueing characteristics of user 
transmissions and the user buffer, as reflected by Eqs. (I.7.9-9)-(I.7.9-11), 
we deduce the following conclusions. The packet delay and buffer 
queue size increase rapidly as U_P approaches its maximal allowable 
value of 1. Fixing an average allowable queue-size value X_P, to 
yield an acceptable probability of overflow (POF) value, results 
by (I.7.9-10) in a delay value D_P = X_P / \lambda_P, if we assume an 
input rate equal to \lambda_P. The delay function D_P is related to U_P = \rho 
= M \lambda_P \tau and M by Eq. (I.7.9-8). We subsequently solve for the 
associated value of M. The latter specifies the required frequency 
of polling (once every M slots) for this terminal. 

We have presented here a model that can apply to the multitude 
of terminals, users, subsystems and application processes in the 
Space Shuttle avionics system. Time-sharing and priority queueing 
models, presented in the previous sections, can also be applied. 
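
The design step of solving for M can be sketched numerically. The delay expression used below is the standard mean packet delay for a fixed-assignment TDMA slot with Poisson packet arrivals, D_P = \tau [ M/2 + 1 + M \rho / (2(1 - \rho)) ] with \rho = M \lambda_P \tau; treating it as the form of Eq. (I.7.9-8) is an assumption, and the function names are hypothetical.

```python
def tdma_terminal(lam_p, tau, m):
    """Utilization, mean packet delay and mean queue size for one terminal.

    Assumption: one slot of tau sec every m slots, Poisson arrivals at
    lam_p packets/sec, standard TDMA mean-delay formula.
    """
    rho = m * lam_p * tau
    if rho >= 1.0:
        raise ValueError("unstable terminal queue: rho >= 1")
    d = tau * (m / 2.0 + 1.0 + m * rho / (2.0 * (1.0 - rho)))
    return rho, d, lam_p * d


def largest_polling_period(lam_p, tau, d_max):
    """Largest m (least frequent polling) whose mean delay stays within d_max."""
    best = None
    m = 1
    while m * lam_p * tau < 1.0:
        _, d, _ = tdma_terminal(lam_p, tau, m)
        if d > d_max:
            break
        best = m
        m += 1
    return best


if __name__ == "__main__":
    print(tdma_terminal(100.0, 1e-3, 5))
    print(largest_polling_period(100.0, 1e-3, 7e-3))
```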

1.7.10 Queueing Analysis for User Terminals: Input Traffic 

We consider in this section the terminal buffer queue-size 
behavior in terms of messages arriving at the terminal from the 
computer system. 

Consider a specific terminal where messages arrive from the 
computer complex according to a Poisson process with a rate of 

\lambda_t = message arrival rate at a terminal [mess./sec] .    (I.7.10-1) 

Each message is assumed to contain S_t bits. Thus: 

\bar{S}_t = E(S_t) = mean terminal message length [bits/mess.] ,    (I.7.10-2) 

\overline{S_t^2} = E(S_t^2) .    (I.7.10-3) 

The terminal is assumed to process (and absorb) the received 
information at a rate of 

C_t = terminal processing rate [bps] .    (I.7.10-4) 

Subsequently, each message requires a terminal processing time of 

S_t C_t^{-1} [sec/mess.] . 

The performance indices of interest are: 

X_t = average message queue size in terminal buffer 
D_t = average delay of a message in terminal buffer 
U_t = index of utilization of terminal buffer. 

Regarding the terminal service system in processing input data 
from the computer as a single-server queueing system, we obtain the 
following results (see Section 1.4.3): 

D_t = \lambda_t \overline{S_t^2} C_t^{-2} / (2 (1 - \rho)) + \bar{S}_t C_t^{-1} , 

\rho = \lambda_t \bar{S}_t C_t^{-1} , 

X_t = \lambda_t^2 \overline{S_t^2} C_t^{-2} / (2 (1 - \rho)) + \rho ; 

U_t = \rho = \lambda_t \bar{S}_t C_t^{-1} . 

If message lengths are exponentially distributed with mean 
\mu_t^{-1} [bits/mess.], we have 

\bar{S}_t = \mu_t^{-1} ,   \overline{S_t^2} = 2 \mu_t^{-2} .    (I.7.10-9) 

For exponentially distributed message lengths (I.7.10-9), we can 
also derive the performance measures while assuming a buffer with 
finite capacity of 

L_t = terminal buffer capacity (in number of messages).    (I.7.10-10) 

Using the results presented in Section 1.6.4, we obtain 
the following expressions. 

X_t = \frac{1 - \rho}{1 - \rho^{L_t + 1}} \sum_{n=0}^{L_t} n \rho^n ,    (I.7.10-11) 

D_t = X_t [ \lambda_t (1 - P_R) ]^{-1} ,    (I.7.10-12) 

U_t = 1 - P_0 ,    (I.7.10-13) 

where P_R is the probability of message rejection and 

P_0 = (1 - \rho) / (1 - \rho^{L_t + 1}) .    (I.7.10-14) 

An additional important measure of performance is now expressed 
by the probability that the buffer is saturated (overflow), POF_t. 
This is also equal to the probability that an arriving message is 
rejected (not accepted) at the terminal, denoted as P_R, due to buffer 
overflow. We have: 

POF_t = probability of terminal buffer overflow 
      = P_R = probability of message rejection 
      = (1 - \rho) \rho^{L_t} / (1 - \rho^{L_t + 1}) .    (I.7.10-15) 

Thus, in designing and analyzing the terminal system we specify 
and compute the delay, utilization and POF measures, using the 
performance formulas given above. Other time-sharing and priority 
queueing models can be applied and analyzed, following the presentations 
and results presented in the previous sections. 
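
The terminal input measures can be sketched as follows. The first function assumes the single-server model is the standard M/G/1 queue (Pollaczek-Khinchine mean delay); the second assumes exponential message lengths and a buffer holding at most a fixed number of messages (the standard M/M/1 queue with finite capacity). Function and argument names are hypothetical.

```python
def terminal_mg1(lam_t, s1_bits, s2_bits, c_t):
    """Utilization, mean delay and mean queue size with an unlimited buffer.

    s1_bits, s2_bits: first and second moments of the message length in bits;
    c_t: terminal processing rate in bps.
    """
    rho = lam_t * s1_bits / c_t
    if rho >= 1.0:
        raise ValueError("unstable terminal queue: rho >= 1")
    d = lam_t * s2_bits / c_t ** 2 / (2.0 * (1.0 - rho)) + s1_bits / c_t
    return rho, d, lam_t * d


def terminal_finite(lam_t, mean_bits, c_t, l_t):
    """Overflow probability, mean queue size and mean delay of accepted messages."""
    rho = lam_t * mean_bits / c_t
    weights = [rho ** n for n in range(l_t + 1)]   # unnormalized P_n
    norm = sum(weights)
    pof = weights[-1] / norm                        # buffer-full probability
    x = sum(n * w for n, w in enumerate(weights)) / norm
    d = x / (lam_t * (1.0 - pof))                   # Little's law, accepted traffic
    return pof, x, d


if __name__ == "__main__":
    print(terminal_mg1(1.0, 1.0, 2.0, 2.0))
    print(terminal_finite(1.0, 1.0, 1.0, 1))
```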







1.8 SYNCHRONIZATION METHODS FOR THE DATA PROCESSING SYSTEM 

1.8.1 Synchronization Considerations for the Data Processing System 

Due to the distributed control of redundant sensors among the 
Space Shuttle avionics network computers, an unacceptable time skew 
can exist between redundant inputs unless the GPCs are synchronized 
prior to initiating the inputs. Similarly, unacceptable data-skew 
may exist at the voting effectors unless a synchronization procedure 
is employed prior to initiating outputs. In addition, unacceptable 
command differences may exist at the voting effectors unless 
synchronization occurs at proper states during program execution. 

Synchronization is accomplished in the Space Shuttle computer 
complex by using inter-computer discrete signals and synchronization 
software. 

Program synchronization is required as well, since computers 
that do not use exactly the same data for computing flight-control 
outputs experience command divergence effects. The time required to 
synchronize program execution depends on the design of the flight 
software operating system. A fixed time-slice system (in which 
all processes are run within a given cycle time) requires a single 
synchronization point in each computational cycle. An interrupt- 
driven system must synchronize at all points at which data are 
calculated in one process and used in another, and at all points 
needed to preserve identical process sequences in all computers 
of the set. 


Synchronization requirements between the GPCs also arise due 
to error detection and recovery objectives. To provide a smooth 
switchover in the case of a failure, the computers must possess 
some degree of synchronization even if the synchronization 
implementation uses only the intercomputer communication lines. 

To achieve a high degree of error detection, comparison and 
voting procedures need to be employed. This requires the outputs 
of the GPCs feeding the comparison/voting stage to be synchronized. 

A software initiated synchronization is performed before: 

• Input commands are issued; 

• Outputs are exchanged for comparison purposes; 

• Compool is updated by the background to pass information to 
the foreground; 

• Real time is obtained. 

A list of all active outputs must be maintained, for comparison 
or voting purposes. For a "bit-by-bit" comparison scheme to perform 
satisfactorily, all inputs to the computers must be identical. These 
include sensor inputs, crew inputs and real time. The FCOS must 
guarantee the proper synchronization to maintain identical inputs. 

For example: 

• Sensor and crew inputs must be commanded only after a proper 
synchronization sequence; 

• If all machines possess independent real time clocks, then 
when real time is desired, the machines must synchronize, 
exchange real time, and utilize a properly defined average 
value to be used in navigation and control loop calculations. 

To keep the GPCs in synchronization the following functions 
are employed. 

a) Synchronization points are specified. For example, the following 
synchronization points can be chosen. 

• Sync upon entrance to a foreground routine; 

• Sync before a data input sequence; 

• Sync before a comparison and output cycle; 

• Sync upon entrance to a trap routine; 

• Sync upon entrance to (or exit from) a background/foreground 
update block; 

• Sync before the real time clock is read and exchanged as data; 

• Sync upon entrance to the interrupt service routine; etc. 

b) A maximum time-out function is specified. This function 
represents the maximum waiting time allowed for the machines to 
synchronize. Different sync points can possess different 
time-out limits. 

c) A topological sync-connection function is specified. This function 
designates the GPCs with which synchronization is to occur at the 
underlying point. 
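
The three functions above (sync points, time-out, and sync-connection set) can be illustrated with a small deterministic sketch. It is not the flight software logic: the rule that the time-out is counted from the first arrival at the sync point, and all names used (sync_point, arrival_times, partners), are assumptions made for illustration.

```python
def sync_point(arrival_times, partners, timeout):
    """Decide which GPCs synchronize at one sync point.

    arrival_times: dict mapping a GPC id to the time (sec) at which it
        reaches the sync point; a GPC that never arrives is simply absent.
    partners: GPC ids designated by the sync-connection function.
    timeout: maximum waiting time (sec), counted from the first arrival.
    Returns (synced ids, failed ids, release time).
    """
    present = {g: t for g, t in arrival_times.items() if g in partners}
    if not present:
        return [], sorted(partners), None
    first = min(present.values())
    synced = sorted(g for g, t in present.items() if t - first <= timeout)
    failed = sorted(set(partners) - set(synced))
    release_time = max(present[g] for g in synced)   # all synced GPCs proceed here
    return synced, failed, release_time


if __name__ == "__main__":
    arrivals = {1: 0.0, 2: 0.0002, 3: 0.005}
    print(sync_point(arrivals, partners={1, 2, 3, 4}, timeout=0.001))
```

In this sketch a GPC that misses the time-out is only reported as failed; what the system does with such a machine belongs to the error handling functions discussed later.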

In the Space Shuttle orbiter avionics system a GPC software 
synchronization technique is thus incorporated into the software 
system to support simultaneous operation of GPCs in a Redundant Set. 
It also supports all active GPCs for System Software Interface 
Processing (SSIP). 

The following software requirements are associated with the 
synchronization procedure: 

a) The synchronization technique is required to meet time skew 
constraints, for sampling data sensors and providing output 
commands to the external voters. 

Allowable time skew on inputs is bounded by a specified value 
denoted as \Delta T_I. Typically, 

\Delta T_I \le 450 \mu sec . 

Allowable time skew on output commands is bounded by a specified 
value denoted as \Delta T_O. Typically, 

\Delta T_O \le 1 msec . 

The input time skew is defined as the time span between the first 
and last input command to the buses of a redundant sensor set to 
achieve the effect of a simultaneous read operation. Additional 
time skews need to be incorporated to account for differences in 
bus transmission times and sensor response times. 

The output time skew is defined as the time difference involved 
in the issuance of redundant output commands to the buses. Additional 
time skews need to be incorporated to account for hardware related 
time differences. 

b) The synchronization technique needs to support the fault 
detection and identification software function. This involves 
GPC self-test procedures in the simplex mode and additional 
bit-by-bit comparisons of specified output commands in the 
redundant mode. 

c) The synchronization technique needs to support synchronization 
of all active GPCs (common sync points) to facilitate system 
software interface processing. 
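
The time skew requirement in item a) can be checked with a few lines of code. The sketch below takes the typical bounds quoted in the text (450 microseconds on inputs, 1 millisecond on outputs) as constants; the constant and function names are hypothetical, and hardware-related skews would have to be added separately.

```python
INPUT_SKEW_LIMIT = 450e-6    # sec, typical bound on input time skew
OUTPUT_SKEW_LIMIT = 1e-3     # sec, typical bound on output time skew


def time_skew(command_times):
    """Span between the first and last command issued by the redundant GPCs."""
    return max(command_times) - min(command_times)


def within_skew(command_times, limit):
    """True if the redundant commands meet the given skew bound."""
    return time_skew(command_times) <= limit


if __name__ == "__main__":
    inputs = [0.0, 100e-6, 400e-6]          # command issue times in sec
    print(time_skew(inputs), within_skew(inputs, INPUT_SKEW_LIMIT))
```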

In particular, SSIP processes are those required to run at 
the same time in all active GPCs, regardless of the major function 
they support. For example, the following functions are elements of 
SSIP processing. (These elements may employ various sync points.) 

A. Intercomputer communications (ICC). 
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B. Time management-required for the input coordination function 
on the reading of the MTU and passing GPC prime clock values. 

C. Downlist control to insure a phase relationship of the downlist 
program. 

D. Configuration change coordination - Required for switch and 
keyboard inputs that require coordinated configuration changes. 

E. Systems status data for display and control - There are
numerous parameters in the system software that are required
to be available for display across all GPCs. There also are
various logic control parameters denoting systems software
status required to be passed among all GPCs (for example,
which GPCs contain which memory configuration).

F. Applications interface - Involves the trading of data between
dissimilar GPCs to support integrated displays and special
interfaces.

G. Launch Data Bus control - Involves changing command configuration 
of the two LDBs when a request to transfer control is received. 

H. GPC initialization - Requires ICC to establish the current 
configurations of other active GPCs. 

I. Annunciation - Common for all memory configurations and
requires ICC coordination to facilitate GPC control of the PL
and DEU buses to output all C&W and alert messages and to
combine identical messages produced by a Redundant Set into
single messages.

J. GPC error handling - System error responses may require 
GPC coordination to determine what logic to invoke (for 
example, to avoid downmoding all GPCs in a redundant set 
for common mode errors).

K. Mass Memory contention coordination - Involves coordination
between GPCs when different configurations require use of a
shared Mass Memory Unit.

I.8.2 A Queueing Model

We present a general queueing model to describe message delays 
and buffer behavior under a synchronization operation. 

The unit under consideration needs to synchronize a process
(being an output or input process) with another process. The
other process can be associated with another unit, or be the average
process generated from processes associated with a set of network
units.

Sync points are determined for the time comparison of the
two processes. To model this comparison operation, we assume that
underlying messages need to be stored in the unit buffer and queued
for a certain time until a time comparison task is completed.

The period of time required for such a message to be
queued in the buffer, denoted as S, can be simply represented
by the formula

$$ S = \frac{\mu^{-1}}{C} + \Delta T_C + \Delta T_D + \Delta T_P , \tag{I.8.2-1} $$

where

$\mu^{-1}$ = average sync message length [bits];

C = unit processing rate [bits/sec];

$\Delta T_C$ = time-skew due to clock differences;

$\Delta T_D$ = time-skew due to differences in propagation delays;

$\Delta T_P$ = time-skew due to hardware processing differences.

If sync points are determined in such a manner that sync
messages arrive as a Poisson process at a rate $\lambda$, where

$\lambda$ = arrival rate of sync messages [messages/sec],

then the unit system under consideration can be considered as a
queueing system.

In particular, applying the queue-size and message delay results
presented in previous sections we obtain the following formulas.

The system traffic intensity is given by

$$ \rho = \lambda S . \tag{I.8.2-2} $$

We require

$$ \rho < 1 , \tag{I.8.2-3} $$

to ensure finite limiting queue-size and message delay values.

Then, the mean buffer queue size $\bar{X}$, representing the average
number of messages in the system, queued in the buffer or under
processing, is given by

$$ \bar{X} = \rho + \frac{1}{2}\,\frac{\rho^2}{1-\rho} . \tag{I.8.2-4} $$

The mean delay (response-time) $\bar{D}$ of a message, representing the amount
of time the message has to spend in the buffer for both queueing and
processing purposes, is given by

$$ \bar{D} = S\left[1 + \frac{1}{2}\,\frac{\rho}{1-\rho}\right] . \tag{I.8.2-5} $$

Using these formulas, network constraints upon buffer queue- 
size and message delays can be applied to deduce the proper constraints 
upon the underlying time-skew functions. 
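
As a rough numerical illustration of Eqs. (I.8.2-1) to (I.8.2-5), the
following Python sketch evaluates the holding time, the traffic
intensity, the mean queue size and the mean delay. It assumes the
constant-holding-time (M/D/1) reading of the formulas, with the first
term of S taken as message length divided by processing rate. The
function name and all numerical values are hypothetical and are not
taken from the study.

```python
def sync_buffer_metrics(msg_bits, rate_bps, skew_clock, skew_delay,
                        skew_proc, arrival_rate):
    """Return (S, rho, mean queue size, mean delay) for a sync buffer.

    Assumes Poisson arrivals and a constant holding time S (M/D/1).
    """
    # Holding time: transmission time plus the three time-skew terms.
    s = msg_bits / rate_bps + skew_clock + skew_delay + skew_proc
    rho = arrival_rate * s  # traffic intensity
    if rho >= 1.0:
        raise ValueError("rho >= 1: queue size and delay are unbounded")
    mean_queue = rho + 0.5 * rho ** 2 / (1.0 - rho)
    mean_delay = s * (1.0 + 0.5 * rho / (1.0 - rho))
    return s, rho, mean_queue, mean_delay


# Hypothetical values: 500-bit messages, 1 Mbit/sec, 400 messages/sec.
s, rho, x_bar, d_bar = sync_buffer_metrics(
    msg_bits=500, rate_bps=1.0e6, skew_clock=450e-6,
    skew_delay=50e-6, skew_proc=100e-6, arrival_rate=400.0)
print(f"S = {s:.6f} sec, rho = {rho:.3f}")
print(f"mean queue size = {x_bar:.4f}, mean delay = {d_bar:.6f} sec")
```

Running the function in the other direction (fixing a limit on the mean
delay and searching over the skew terms) gives the kind of time-skew
constraint referred to above.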

I.8.3 Clock Synchronization Procedures

We consider the problem of time synchronization for the Shuttle
computers, the data bus and other Shuttle systems using the time
functions.

The two main methods that can be applied to synchronize the 
GPC (or other unit oscillator) can be classified as: 

•Master-Slave Sync Techniques 
•Mutual Lock Synchronization Techniques 

Under the master-slave sync method, one oscillator is named 
the master and is the frequency reference. The other oscillators 
are synchronized to the master using phase lock loops. Failure 
of the master oscillator must be detected whereupon another 
oscillator is named the master. 

Successive master oscillators are selected in order from the
surviving oscillators. There are two problems involved in this
scheme. First, since the entire system operation depends upon proper
operation of the master oscillator, failure of the master oscillator
must be detected and corrected, and two-failure tolerant failure
detection is cumbersome. Second, the circuitry must be reconfigured
to select a new oscillator to be the master from the remaining
surviving oscillators.

The mutual lock synchronization scheme works as follows. Each
oscillator is controlled by a filter, in this case a phase lock loop.
The outputs of all four oscillators are added together and applied
to the inputs of each phase lock loop. The phase detector at each
phase lock loop input determines the relative phase between a
particular oscillator and each component of the summed input.


For example, if the oscillator outputs are considered to be
sinusoids, the summed outputs will be

$$ e_0 = \sum_{i=1}^{4} A_i \sin(\omega_c t + \phi_i) $$

where the $\phi_i$'s are measured with respect to some arbitrary but
consistent reference. Now the $j$th phase detector measures the
phase difference between the $j$th oscillator and each of the $i$
components, and it outputs the sum of these phase differences.
Thus, the $j$th detector output is

$$ \Phi_j = \sum_{i=1}^{4} (\phi_i - \phi_j) $$

where j takes on the values 1, 2, 3, 4.

It can be shown that, as a result of this summing of phase 
error at each input, the several oscillators will achieve mutual 
synchronization with normal loop dynamics. This is true provided 
the center frequencies are within a mutual "pull range" to begin 
with. 
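
The pull-in behavior described above can be sketched numerically. The
Python fragment below is a minimal illustration, not the study's model:
it assumes noise-free first-order loops, a sinusoidal phase detector
summed over all four oscillators, equal loop gains and simple Euler
integration. The function name, the gain and the center frequencies
(in rad/sec) are hypothetical.

```python
import math


def simulate_mutual_lock(center_freqs, gain, dt=1e-3, steps=20000):
    """Integrate mutually locked first-order loops; return final frequencies."""
    n = len(center_freqs)
    phases = [0.0] * n
    freqs = list(center_freqs)
    for _ in range(steps):
        # Each oscillator is steered by the sum of its phase differences
        # with respect to every component of the summed input.
        freqs = [
            center_freqs[j]
            + gain * sum(math.sin(phases[i] - phases[j]) for i in range(n))
            for j in range(n)
        ]
        phases = [p + f * dt for p, f in zip(phases, freqs)]
    return freqs


# Hypothetical center frequencies that lie well inside the pull range.
final = simulate_mutual_lock([100.0, 100.2, 99.9, 100.1], gain=1.0)
print("frequency spread after lock:", max(final) - min(final))
print("common frequency:", sum(final) / len(final))
```

Under these assumptions the four frequencies settle to the average of
the center frequencies. Widening the spread of center frequencies
relative to the gain eventually prevents lock, which is the effect the
pull-range limit exploits.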

Therein lies the key to failure safe operations for the mutual 
lock method. The tracking range of each oscillator is limited by 
clamps of the frequency control input of the oscillator. When an 
oscillator fails off frequency, loss of lock is assured by properly 
limiting the pull range. The failed oscillator will then be off 
frequency and will be properly ignored by the remaining phase lock 
loops due to the selection of phase lock loop bandwidth smaller than 
the failed frequency shift. The important point here is that failure 
of an oscillator does not cause detriment to the remaining oscillators 
because any oscillator introduces vital control into the loop only 
when a proper signal is present. 


The oscillator used in the clocking circuit can fail in
the following ways:

•No output 
•Wrong output levels 
•Small frequency drift 
•Large frequency shift 

The first two failure modes can be easily detected by comparison
of the performance of the quad computers and will not be detected in
the clocking scheme proposed. The last two, however, can cause
erroneous calculations of a more subtle nature and must be monitored
and any failure rectified.

A detector can be implemented to determine the frequency error
between any oscillator and a reference oscillator. The difficulty
here is that the reference oscillator may fail or the comparison
circuit may fail. The failure modes of the reference oscillator
are the same as for an operational oscillator. The failure detector
circuit (comparison circuit) on the other hand may fail in one of
two ways: 1) it may erroneously indicate a failure of an oscillator
(failure in the FAIL state) or 2) it may erroneously indicate that
an oscillator is operational (failure in the GOOD state).

Therefore, it is imperative in the oscillator failure detection 
scheme to provide that frequency error detection be done without 
introducing added failure modes. Oscillator frequency error can 
be determined in two ways. First, it can be deduced by comparing 
computer calculations using data derived from each reference 
oscillator. Secondly, oscillator failure can be determined by 
employing a double-fail-tolerant oscillator failure detector.

In order to use the master-slave synchronization technique,
failure detection of the master oscillator must be done followed
by an electronic reconfiguration to select a new master oscillator.
In order to maximize hardware efficiency, failure detection may be
done by a comparison between operational oscillators. Such a
comparison between two oscillators gives not a positive indication
of failure of either oscillator, but a failure syndrome indicator;
the failure can be either of the oscillators or the failure detector.
The failure of a particular oscillator can be determined by taking a
majority vote amongst several syndrome indicators, depending upon
the number of failures to be tolerated.

In turn, a clock system using a mutual failure detection 
principle can be used. Such a scheme is designed to guarantee positive 
failure indication of the five oscillators in spite of any three 
failures of oscillators or detection circuitry. Oscillator failure 
is announced any time two syndrome indicators go to the FAIL state. 
Those syndromes associated with the failed oscillator are then 
removed from service and no more comparisons accepted from them for 
additional failure indications. This requires memory of prior 
failures and also control functions between failure indication (FI) 
logic. Because of the need to have a three-failure tolerant failure 
detection scheme, the FI logic must be triple redundant with fail 
proof wired "OR" failure indication. 

The drawbacks in the mutual failure detecting clock system are 
as follows. The control exerted by one oscillator and its failure 
circuitry upon the others paves the way for catastrophic failure of 
one unit to destroy the others. Therefore, when oscillator failure
detection is incorporated, the failure detection should be done
on a basis wherein independence is maintained between the four
clocking subsystems. In general, when a mutual synchronization
procedure is employed the structure shown in Fig. 1.8.1 can be
employed.


Figure 1.8.1


Comparisons and fault-detection procedures are used upon the 
received processes (phases), in establishing the integrity of the 
underlying clocks. Subsequently, the healthy phase processes are 
summed to yield an average phase process. The latter feeds the 
phase-locked loop of the system (GPC) under consideration, as 
shown in Fig. 1.8.1 (for GPC number 1). 

The system analysis of such a loop is carried out in the
following manner. Consider the PLL model for oscillator number 1
shown in Fig. 1.8.2.

Figure 1.8.2 (PLL model for oscillator number 1: summed input y(t),
loop filter $F_1(p)$ and VCO reference r(t))

Neglecting the VCO tuning voltage and VCO instability one can
write the stochastic nonlinear differential equation by inspection
as follows:

$$ \hat{\theta}_1 = \frac{K F_1(p)}{p}\left[\sin\Phi_1 + N(t,\Phi_1)\right] $$

where

$\Phi_1 = (\theta_1 + \theta_2 + \theta_3 + \theta_4 - 4\hat{\theta}_1)$ = phase error;

$N(t,\Phi_1)$ = equivalent phase noise, a combination of quadrature
noise components weighted by $\cos\Phi_1$ and $\sin\Phi_1$.


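We can write

$$ \Phi_1 = \sum_{i=1}^{4}\theta_i - 4\hat{\theta}_1
          = (\theta_1 + \theta_2 + \theta_3 + \theta_4)
            - \frac{4 K F_1(p)}{p}\left[\sin\Phi_1 + N(t,\Phi_1)\right] , $$

that is,

$$ \Phi_1 = \sum_{i=1}^{4}\theta_i
            - \frac{4 K F_1(p)}{p}\left[\sin\Bigl(\sum_{i=1}^{4}\theta_i - 4\hat{\theta}_1\Bigr) + N(t,\Phi_1)\right] \tag{I.8.3-2} $$

where

$$ \Phi_1 = (\theta_1 + \theta_2 + \theta_3 + \theta_4 - 4\hat{\theta}_1) . $$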

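Similarly, the equations for the other three loops may be written
as follows:

$$ \Phi_2 = \sum_{i=1}^{4}\theta_i
            - \frac{4 K F_2(p)}{p}\left[\sin\Bigl(\sum_{i=1}^{4}\theta_i - 4\hat{\theta}_2\Bigr) + N(t,\Phi_2)\right] \tag{I.8.3-3} $$

where

$$ \Phi_2 = (\theta_1 + \theta_2 + \theta_3 + \theta_4 - 4\hat{\theta}_2) ; $$

$$ \Phi_3 = \sum_{i=1}^{4}\theta_i
            - \frac{4 K F_3(p)}{p}\left[\sin\Bigl(\sum_{i=1}^{4}\theta_i - 4\hat{\theta}_3\Bigr) + N(t,\Phi_3)\right] \tag{I.8.3-4} $$

where

$$ \Phi_3 = (\theta_1 + \theta_2 + \theta_3 + \theta_4 - 4\hat{\theta}_3) ; $$

$$ \Phi_4 = \sum_{i=1}^{4}\theta_i
            - \frac{4 K F_4(p)}{p}\left[\sin\Bigl(\sum_{i=1}^{4}\theta_i - 4\hat{\theta}_4\Bigr) + N(t,\Phi_4)\right] \tag{I.8.3-5} $$

where

$$ \Phi_4 = (\theta_1 + \theta_2 + \theta_3 + \theta_4 - 4\hat{\theta}_4) . $$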


Equations (I.8.3-2) to (I.8.3-5) represent the system equations
for four parallel coupled loops. Each equation is a nonlinear
stochastic differential equation with coupling introduced due to
other phase lock loops. By assuming
$F_1(p) = F_2(p) = F_3(p) = F_4(p) = 1$, i.e., a first order loop, and
linearizing so that $\sin\Phi \approx \Phi$, the Fokker-Planck technique
of analysis can be applied to solve the simplified equations.











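For the two phase locked loop clocks the steady state equations are

$$ W_1 = W_{01} + K_1 \cos(\phi_2 - \phi_1 - \theta_{12}) \tag{I.8.3-6} $$

$$ W_2 = W_{02} + K_2 \cos(\phi_1 - \phi_2 - \theta_{21}) \tag{I.8.3-7} $$

where $W_s = W_1 = W_2$ = the steady state output frequency of both
oscillators.

For a practical network let

$$ W_{01} = W_{02} \quad \text{and} \quad K_1 = K_2 = K . $$

Then equations (I.8.3-6) and (I.8.3-7) give

$$ 0 = W_{01} - W_{02} + K\left[\cos(\phi_2 - \phi_1 - \theta_{12}) - \cos(\phi_1 - \phi_2 - \theta_{21})\right] $$

or

$$ \frac{W_{01} - W_{02}}{2K \sin\left(\frac{\theta_{12} + \theta_{21}}{2}\right)}
   = -\sin\left(\phi_2 - \phi_1 + \frac{\theta_{21} - \theta_{12}}{2}\right) . $$

Let

$$ \Delta = \frac{W_{01} - W_{02}}{2K \sin\left(\frac{\theta_{12} + \theta_{21}}{2}\right)} , \quad
   \phi = \phi_2 - \phi_1 , \quad
   \theta = \frac{\theta_{21} - \theta_{12}}{2} . $$

Substitute these in the above equation to give

$$ -\Delta = \sin(\phi + \theta) . $$

For a practical case

$$ \Delta W_0 = W_{01} - W_{02} \approx 0 \quad \text{and} \quad K \gg 1 , $$

so that

$$ \phi = -\theta . $$

Substitute this in equation (I.8.3-6) to get

$$ W_1 = W_{01} + K \cos\left(\frac{\theta_{12} + \theta_{21}}{2}\right) . $$

Thus in general

$$ W_s = W_{0i} + K \cos\left(\frac{\theta_{12} + \theta_{21}}{2}\right) . \tag{I.8.3-8} $$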
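
The steady-state result can be checked numerically for the two-loop
case. The sketch below assumes the cosine phase-detector form of
Eqs. (I.8.3-6) and (I.8.3-7) with equal center frequencies and equal
gains, integrates the two phases with a simple Euler step, and
compares the settled frequency with the closed form
$W_0 + K\cos[(\theta_{12}+\theta_{21})/2]$. The function name and the
numerical values are hypothetical.

```python
import math


def two_loop_frequency(w0, gain, theta12, theta21, dt=1e-3, steps=20000):
    """Integrate two mutually coupled loops; return final (W1, W2)."""
    phi1, phi2 = 0.0, 0.3  # arbitrary initial phases
    w1 = w2 = w0
    for _ in range(steps):
        w1 = w0 + gain * math.cos(phi2 - phi1 - theta12)
        w2 = w0 + gain * math.cos(phi1 - phi2 - theta21)
        phi1 += w1 * dt
        phi2 += w2 * dt
    return w1, w2


# Hypothetical values (rad/sec and rad).
w1, w2 = two_loop_frequency(w0=10.0, gain=5.0, theta12=0.4, theta21=0.6)
predicted = 10.0 + 5.0 * math.cos((0.4 + 0.6) / 2.0)
print("simulated:", w1, w2)
print("closed form:", predicted)
```

The lock point is stable in this sketch because
$\sin[(\theta_{12}+\theta_{21})/2] > 0$ for the chosen values; other
choices may settle at a different equilibrium.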


These methods can now be integrated with the queueing 
techniques presented previously in this section and the reliability 
methods developed in the following sections, to obtain global system 
performance characteristics.

II. SYSTEM RELIABILITY MEASURES AND COMMUNICATION PATH
FAILURE ANALYSIS

II.1 Reliability Features of the Data Processing Network

The Space Shuttle orbiter avionics system is described in 
Section 1.1. In this section we will summarize the main system 
reliability features. 

The Space Shuttle avionics system contains five general
purpose computers (GPCs) communicating with the avionic subsystems
over a network of serial data buses (see Figs. I.1.1-I.1.2). Four
of the five GPCs are identically programmed to perform
flight-critical functions, such as guidance, navigation and control.
The fifth computer is programmed to perform non-flight-critical
avionic functions.

Subsystems that perform similar functions are assigned to 
the same data-bus group. There are seven such groups (Fig. 1.1.1). 
The subsystems have varying levels of redundancy at the unit level, 
depending on their criticality. To prevent the loss of more than
one redundant unit when one data bus fails, no two redundant units
interface with the same bus.

During time-critical mission phases (when recovery time is 
less than one second), such as boost, reentry and landing, four 
of the five GPCs operate as a redundant set, receiving the same 
input data, performing the same flight critical computations and 
transmitting the same output commands. In this mode of operation, 
efficient detection and identification of two flight critical
computer failures is provided by comparing the output commands
and "voting" on the results. This is called the voting subsystem.
After two failures, the remaining two computers in the set use
comparison and self-test techniques to provide tolerance of a 
third failure. The voting mechanism thus allows a computer to 
transmit incorrect commands to critical subsystems for an 
indefinite number of cycles without having adverse effects on 
system operation. 

Each of the redundant subsystems is connected to a different 
bus. Thus, a different computer requests data from each of the 
subsystems and the returned data are available to all other 
computers in the set. 

In non-critical phases of the mission, each of the GPCs is
associated with a proper dedicated subset of subsystems. This
non-redundant configuration is termed the simplex mode.

Topologically, we note that the data processing system is
structured around a central set of GPCs. The latter are
interconnected to the subsystems so that they can be operated in
redundant groups to provide critical services.

Interface adaptation between the data bus network and the 
orbiter subsystems is accomplished by multiplexer/demultiplexer 
(MDM) units. The GPC complex is interfaced with the data bus
network through the set of I/O processors (IOPs). The serial
digital data buses are time-shared, so that data transfer is
carried on a time-division multiplexed (TDM) basis, using pulse
code modulation (PCM).

Each GPC contains a self-testing program as well as built-in 
test equipment. The latter enables it to attain a 96% fault detection 
capability.

Each computer IOP interfaces with the other IOPs and with
the interfacing subsystems over the 24 separate serial data
buses. The IOP contains a set of 24 independent processors,
called Bus Control Elements (BCEs). A 25th processor, the Master
Sequence Controller (MSC), controls the operation of the 24
BCEs. These 25 processors act as separate digital computers,
with data processing programs independent of the CPU programs.

Each BCE controls a Multiplexer Interface Adapter (MIA), which 
is connected to the serial data buses via bus couplers (see Fig. 1.1.3). 
The MIA transmits and receives information, encodes and decodes 
bus data, and tests for parity and proper synchronization of bits. 

In describing the reliability, fault detection and failure
properties of the avionics data processing network, we will identify
the relevant failure and reliability measures and models for: the
computer system; the communication network; the subsystem complex;
and the proper integrated interfaces among these subnetworks.

II.2 Failure Parameters and Reliability Performance Measures for
the Computer Complex

In considering failures of system elements, we examine failures 
associated with the computer system, the communication network and 
the application subsystems. 

We first consider failures associated with the computer
complex. A GPC is assumed to have an average failure rate equal
to $\lambda_c$ [failures/sec], so that

$\lambda_c$ = average GPC failure rate [failures/sec]      (II.2-1)

The period of time from initiation of operation to the failure
of a GPC is called the GPC lifetime. It is a random variable,
denoted as $T_c$. Thus

$T_c$ = GPC lifetime = GPC operational time until
failure [sec]                                          (II.2-2)

The mean duration $E(T_c) = \bar{T}_c$ is equal to

$$ E(T_c) = \bar{T}_c = \lambda_c^{-1} . \tag{II.2-3} $$

To statistically characterize $T_c$ we need to specify its
distribution function $F_c(x)$,

$$ F_c(x) = P(T_c \le x) , \quad x \ge 0 . \tag{II.2-4} $$

It is often assumed that $T_c$ is exponentially distributed,
so that

$$ F_c(x) = 1 - e^{-\lambda_c x} , \quad x \ge 0 . \tag{II.2-5} $$

Other lifetime distributions are sometimes also used. For example,
a useful two parameter lifetime distribution is the Gamma distribution
with parameters $\lambda > 0$ and k = 1,2,..., given by the density

$$ f_c(t) = \frac{dF_c(t)}{dt} = \frac{\lambda(\lambda t)^{k-1}}{(k-1)!}\, e^{-\lambda t} , \quad t \ge 0 . \tag{II.2-6} $$

Another useful lifetime distribution is the Weibull distribution
with parameters v and k, $v > 0$, $k > 1$, given as

$$ F_c(x) = \begin{cases}
   1 - \exp\left[-\left(\dfrac{x}{v}\right)^{k}\right] , & x \ge 0 \\
   0 , & x < 0
   \end{cases} \tag{II.2-7} $$


The conditional failure rate function $h_c(x)$, also called the
hazard function, is given by

$$ h_c(x) = \frac{f_c(x)}{1 - F_c(x)} . \tag{II.2-8} $$

The hazard function $h_c(x)$ yields the density of computer failure
after a lifetime of duration x, given that it has not failed during
its first x units of time of operation. Thus:

$$ h_c(x)\,dx = P\{x < T_c \le x + dx \mid T_c > x\} . \tag{II.2-9} $$

For a Weibull lifetime distribution, with parameters v and k,
we have

$$ h_c(x) = \frac{k}{v}\left(\frac{x}{v}\right)^{k-1} . \tag{II.2-10} $$

Thus, the chance of a GPC failure increases with time in accordance
with expression (II.2-10).

For an exponential lifetime distribution (II.2-5) with parameter
$\lambda_c$, we obtain

$$ h_c(x) = \lambda_c , \quad \text{for each } x \ge 0 . \tag{II.2-11} $$


Thus, under an exponential lifetime distribution, the conditional 
GPC failure rate is constant. The chance that a GPC will currently 
fail, given it has not yet failed, is independent of the length of 
time this GPC has been operational. The exponential distribution 
is therefore memoryless. A non-exponential distribution, such as the 
Gamma or Weibull distribution, should be used if the GPC conditional 
failure rate cannot be assumed to be constant. In general, the GPC
conditional failure rate is a non-decreasing function of the past
GPC lifetime.
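
The contrast between a constant and an increasing conditional failure
rate can be made concrete with a short Python sketch. It assumes the
Weibull form $F_c(x) = 1 - \exp[-(x/v)^k]$ and checks the closed-form
hazard $(k/v)(x/v)^{k-1}$ against the definition $f_c/(1 - F_c)$
evaluated by numerical differentiation. The function names and the
parameter values are hypothetical.

```python
import math


def weibull_cdf(x, v, k):
    """Weibull lifetime distribution function (zero for negative x)."""
    return 1.0 - math.exp(-((x / v) ** k)) if x >= 0.0 else 0.0


def weibull_hazard(x, v, k):
    """Closed-form Weibull hazard; k = 1 gives the constant exponential rate."""
    return (k / v) * (x / v) ** (k - 1)


def numerical_hazard(cdf, x, h=1e-6):
    """Hazard from its definition f(x) / (1 - F(x)), central difference for f."""
    density = (cdf(x + h) - cdf(x - h)) / (2.0 * h)
    return density / (1.0 - cdf(x))


v, k = 3.0, 2.0  # hypothetical scale and shape parameters
for x in (1.0, 2.0, 4.0):
    exact = weibull_hazard(x, v, k)
    approx = numerical_hazard(lambda u: weibull_cdf(u, v, k), x)
    print(f"x = {x}: closed form {exact:.6f}, numerical {approx:.6f}")
```

With k = 1 the same functions reproduce the memoryless case, where the
hazard equals 1/v for every x.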

Considering a simplex operation of a GPC, self-test procedures and
programs are used to detect a computer failure. The probability
of a computer failure detection, using only self-test techniques, is
called the computer coverage. Thus, we set

$P_d$ = GPC coverage probability
= P{failure detected by GPC self-test operation |
GPC failure occurred} .                                (II.2-12)

In the Space Shuttle avionics system, a goal of 96% coverage 
of computer failures has been set, when no external test equipment 
or cooperative use of other GPCs is employed. 

To obtain

$$ P_d = 0.96 , $$

all GPC self-test techniques are employed, including: built-in
test equipment, timer, and micro- and macro-coded self-testing
procedures. A CPU storage of 110 half-words and a CPU processing
time of 1.3 msec is required.

To attain a coverage of

$$ P_d = 0.88 , $$

the above mentioned macro-coded self-testing procedure can be
withdrawn. Then, a CPU storage of only 14 half-words and a CPU
processing time of only 0.15 msec is required.

It is worthwhile to achieve $P_d = 0.96$ prior to assigning a
GPC to a redundant set. However, to save storage and processing
time in using self-test procedures in the redundant set, during
critical mission phases, it is sufficient to attain $P_d = 0.88$.





The resulting redundant set reliability measure will be evaluated 
in a later section. 

The built-in test equipment by itself can yield $P_d = 0.37$. It
requires virtually no additional CPU storage and processing resources.

It is also of interest for certain mission purposes, to model
secondary GPC failures. These are failures that do not affect the
operation of the GPC as related to the present mission. Given that
a GPC failure has occurred, we let $P_{SF}$ be the probability that it is
a secondary failure. Thus:

$$ P_{SF} = P\{\text{failure is secondary} \mid \text{GPC failure has occurred}\} . \tag{II.2-13} $$

Thus, we have

$\lambda_{CS}$ = GPC secondary failure rate = $P_{SF}\lambda_c$ ,
$\lambda_{CP}$ = GPC primary failure rate = $(1 - P_{SF})\lambda_c$ .

It is also possible to differentiate between transient and
permanent GPC failures. A transient GPC failure will cause an
incorrect computer output which can be restored within a relatively
short period of time $T_{TR}$. A much longer restoration time $T_{PR}$
is required to correct a permanent failure. The corresponding mean
restoration times are

$$ \bar{T}_{TR} = E[T_{TR}] , \quad \bar{T}_{PR} = E[T_{PR}] . \tag{II.2-15} $$

Restoration times are sometimes assumed to be exponentially 
distributed, but any proper distribution (such as a Gamma distribution) 
can be assumed. For critical mission phases, we can set Tno 




In detecting computer failures, use is made of mutual tests
and data interchange between GPCs, of inter-GPC comparisons, as
well as of self-test procedures. We set

$P_d(N)$ = probability of detecting a GPC failure, given it
has occurred, when both self-test procedures and
comparison procedures among N GPCs are used.           (II.2-16)

Clearly, we have

$$ P_d = P_d(1) , \tag{II.2-17} $$

$$ P_d(N) \ge P_d(N-1) , \quad P_d(N) \ge P_d , \quad N > 1 . \tag{II.2-18} $$


In choosing reliability performance measures to assess the
failure invulnerability of the Space Shuttle avionics computer
complex, we consider the two computer system modes: the simplex
mode and the redundant mode.

In the simplex mode, an operating GPC serves a certain set of
subsystems. To assess its operational reliability we define the
following indices.

$Q_{CF}(T)$ = probability of a computer failure within
T sec of operation, in simplex mode.                   (II.2-19)

$L_{CF}$ = mean time between GPC failures (MTBF), in
simplex mode.

If we incorporate computer restoration operations, then we
are also interested in the following performance function:

$Q_{RN}$ = probability of a GPC being in a failure state,
under restoration, in simplex mode.

In the simplex mode, when a computer fails, it can be replaced by
another one. It is assumed that a minimum of two GPCs is required
for regular operation. One is then interested in computing the
simplex system loss probability:

$Q_{SL}(T)$ = the simplex system loss probability
= probability that there are no two operational GPCs,
in simplex mode, in T units of time.                   (II.2-22)

We turn now to consider the redundant computer system mode.
In this mode, 4 GPCs are operating in parallel, performing identical
information processing operations. Comparisons are made between
the computer outcomes. A voting procedure is then employed. The
failure of one or two GPCs is immediately identified and GPC-located
by the voting mechanism. The failure of a third GPC is indicated
by the voting procedure. However, to detect which of the remaining
two GPCs has failed, self-testing procedures are utilized.

We assess the computer-complex reliability performance in the
redundant mode by the following measures.

$P_{SL}(T)$ = probability of a computer system loss during a T sec
redundant computer system operation.                   (II.2-23)

The redundant computer system is said to be lost, during a mission
phase of T sec duration, if no GPC remains operational.

In assessing the increase in reliability contributed by the
number of redundant parallel GPCs (denoted as N), we are also
interested in computing the index:

$P_{SL}(T,N)$ = probability of system loss during a T sec
redundant operation of N GPCs.                         (II.2-24)

We note that in the present system, N = 4, so that

$$ P_{SL}(T) = P_{SL}(T,4) . $$

The following mean time between failures also provides a
measure of redundant system invulnerability.

$T_F(N)$ = mean time to system failure of a redundant N-GPC
computer complex.                                      (II.2-2

In the present system N=4, so that we set

$T_F = T_F(4)$ .                                       (II.2-2

II.3 Failure Analysis for the Computer System: The Simplex Mode

II.3.1 Single GPC Failure Analysis

In the simplex mode, each of the GPCs is associated with a
proper dedicated subset of subsystems.

Assume that out of the N available GPCs, only M GPCs are used
on a dedicated basis, $M \le N$. The remaining N-M GPCs are used to
replace failing GPCs.

Each GPC is governed by a failure rate $\lambda_c$. (Assume only
primary failures.) Using self-testing procedures, the probability
of detecting a GPC failure, once it has failed, is equal to $P_d$.

The mean time to failure of a GPC is thus equal to

$$ \bar{L}_{CF} = T_F(1) = \text{mean time to failure for a simplex GPC} = \lambda_c^{-1} . \tag{II.3.1-1} $$

If the time to failure $L_{CF}$ of a single GPC is exponentially distributed
we have

$$ P(L_{CF} > t) = e^{-\lambda_c t} , \quad t \ge 0 . $$

A time-dependent self-testing failure detection process is
described as follows. We set

$L_{FD}$ = time to failure detection, by self-testing techniques,
for a simplex GPC, given failure has occurred.         (II.3.1-3)

The mean time to failure detection $\bar{L}_{FD}$ is equal to

$$ \bar{L}_{FD} = E[L_{FD}] = \lambda_d^{-1} , \tag{II.3.1-4} $$

provided failure detection occurs. If $L_{FD}$ is exponentially
distributed, we set

$$ P(L_{FD} > t) = (1 - P_d) + P_d\, e^{-\lambda_d t} , \quad t \ge 0 . \tag{II.3.1-5} $$

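Therefore, we conclude that

Probability of a GPC undetected failure in t units of time

$$ = \int_0^t P[L_{CF} \in (u, u+du)]\, P(L_{FD} > t-u) $$

$$ = \int_0^t \lambda_c e^{-\lambda_c u}\left[P_d\, e^{-\lambda_d (t-u)} + 1 - P_d\right] du $$

$$ = \frac{\lambda_c P_d}{\lambda_c - \lambda_d}\left(e^{-\lambda_d t} - e^{-\lambda_c t}\right)
     + (1 - P_d)\left(1 - e^{-\lambda_c t}\right) . \tag{II.3.1-6} $$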
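
As a check on Eq. (II.3.1-6), the following Python sketch evaluates the
convolution integral by Simpson's rule and compares it with the closed
form. It assumes exponential failure and detection times with rates
lam_c and lam_d (lam_c different from lam_d) and coverage p_d. The
function names and the numerical values are hypothetical.

```python
import math


def undetected_failure_closed_form(t, lam_c, lam_d, p_d):
    """Closed form for the undetected-failure probability (lam_c != lam_d)."""
    detected_late = (lam_c * p_d / (lam_c - lam_d)) * (
        math.exp(-lam_d * t) - math.exp(-lam_c * t))
    never_detected = (1.0 - p_d) * (1.0 - math.exp(-lam_c * t))
    return detected_late + never_detected


def undetected_failure_numeric(t, lam_c, lam_d, p_d, n=2000):
    """Same probability from the integral, using Simpson's rule (n even)."""
    def integrand(u):
        still_undetected = p_d * math.exp(-lam_d * (t - u)) + 1.0 - p_d
        return lam_c * math.exp(-lam_c * u) * still_undetected

    h = t / n
    total = integrand(0.0) + integrand(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


# Hypothetical rates: one failure per 10^4 sec, detection within about 1 sec.
args = dict(t=10.0, lam_c=1.0e-4, lam_d=1.0, p_d=0.96)
print("closed form:", undetected_failure_closed_form(**args))
print("numerical  :", undetected_failure_numeric(**args))
```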


Thus, with probability $1 - e^{-\lambda_c t}$ a GPC will fail within t units of
time. After failure, by self-testing techniques its failure will be
detected with probability $P_d$, and will remain undetected with
probability $1 - P_d$. The dynamics of failure detection are described
by Eq. (II.3.1-5). The latter yields the probability that failure
detection (by self-testing) will require more than t units of time.
Eq. (II.3.1-6) describes the probability that a GPC failure will occur
within t units of time and that the failure will remain undetected
during this period.

II.3.2 Failure Analysis for the Simplex Computer System

We consider the computer complex under the simplex mode. It
is assumed that M GPCs need to be used on a regular basis, each being
assigned a dedicated set of subsystems. The total number of available
GPCs is equal to N, $N \ge M$. For the avionics system, we typically
have N = 5, M = 2. The failure characteristics of each GPC have
been analyzed in the previous section.

We assume now that upon the detection of a failed GPC, it is
immediately replaced by a reserve GPC, if one is available.
Initially, M GPCs are operating and N-M GPCs serve as reserve units.

We say that system loss has occurred when no more than M-1 operating
GPCs are left. Thus, we set

$$ Q_{SL}(T,M) = P\{\text{no more than M-1 GPCs are left}\} . \tag{II.3.2-1} $$

For the avionics system, M=2, so that

$$ Q_{SL}(T) = Q_{SL}(T,2) . \tag{II.3.2-2} $$

We wish to compute $Q_{SL}(T,M)$ and $Q_{SL}(T)$.

The GPC failure point process can be noted to be a Poisson
process with rate $M\lambda_c$ [failures/sec]. We subsequently obtain the
following result.

$$ Q_{SL}(T,M) = P\{\text{at least N-M+1 computer failures in T units of time}\}
   = 1 - \sum_{n=0}^{N-M} e^{-M\lambda_c T}\,\frac{(M\lambda_c T)^n}{n!} . \tag{II.3.2-3} $$

Therefore, for N=5, M=2, we obtain:

$$ Q_{SL}(T) = 1 - \sum_{n=0}^{3} e^{-2\lambda_c T}\,\frac{(2\lambda_c T)^n}{n!}
   = 1 - e^{-2\lambda_c T}\left[1 + 2\lambda_c T + \frac{(2\lambda_c T)^2}{2} + \frac{(2\lambda_c T)^3}{6}\right] . \tag{II.3.2-4} $$

Using expression (II.3.2-4), we can thus compute the
probability $Q_{SL}(T)$ that the simplex system fails, so that no more than
a single GPC is operating, in T units of operation time. Alternatively,
given a desired maximum simplex loss probability $Q_0$, we can evaluate
the critical time $T_0$ such that

$$ T_0 = \max\{T : Q_{SL}(T) \le Q_0\} . \tag{II.3.2-5} $$

To compute $T_0$, we solve

$$ Q_{SL}(T_0) = Q_0 . $$
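
The loss probability and the critical time can be evaluated with a few
lines of Python. The sketch below assumes the Poisson failure model
with rate M times lam_c and sums the Poisson terms up to N-M failures,
which matches the upper limit of 3 used for N = 5, M = 2. The critical
time is found by bisection, since the loss probability increases with
T. The function names, the failure rate and the target probability are
hypothetical.

```python
import math


def simplex_loss_probability(t, lam_c, n_total=5, m_required=2):
    """P{at least N-M+1 failures in time t} for a Poisson rate M*lam_c."""
    mean = m_required * lam_c * t
    survive = sum(math.exp(-mean) * mean ** n / math.factorial(n)
                  for n in range(n_total - m_required + 1))
    return 1.0 - survive


def critical_time(q0, lam_c, n_total=5, m_required=2):
    """Largest T with loss probability <= q0, found by bisection."""
    if not 0.0 < q0 < 1.0:
        raise ValueError("q0 must lie strictly between 0 and 1")
    hi = 1.0 / lam_c
    # Expand the bracket until the loss probability exceeds the target.
    while simplex_loss_probability(hi, lam_c, n_total, m_required) <= q0:
        hi *= 2.0
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if simplex_loss_probability(mid, lam_c, n_total, m_required) <= q0:
            lo = mid
        else:
            hi = mid
    return lo


lam_c = 1.0e-4  # hypothetical GPC failure rate [failures/sec]
t0 = critical_time(1.0e-6, lam_c)
print("critical time T0 [sec]:", t0)
print("loss probability at T0:", simplex_loss_probability(t0, lam_c))
```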


II.3.3 Restoration Analysis for the Simplex Computer System

We consider the simplex computer system presented in the previous
section, but now assume that failed GPCs can be restored. We assume
the GPC restoration time $T_R$ to be exponentially distributed,

$$ P(T_R > t) = e^{-\lambda_R t} , \quad t \ge 0 , \tag{II.3.3-1} $$

with a mean restoration time equal to

$$ \bar{T}_R = \lambda_R^{-1} . $$

Computer time to failure is exponentially distributed with mean $\lambda_c^{-1}$.
Assume here that $P_d = 1$. There are altogether N GPCs. Only M GPCs
can be used simultaneously, where $M \le N$.




To analyze the statistical characteristics of this computer
system, we model it as a proper queueing network, shown in
Fig. II.3.3.1.

Figure II.3.3.1

In this queueing network, no more than M GPCs can be used in
parallel. Each will fail after an average operating time equal to
$\lambda_c^{-1}$. Upon its failure, a GPC is restored. The average restoration
time is equal to $\lambda_R^{-1}$. When a GPC is restored, it immediately joins
the queue of reserve GPCs. Whenever the number of GPCs in service
falls below M, a reserve GPC (if available) enters service.

To analyze this system, we use and extend the methods developed
in Sections I.6.5-I.6.6. We set

$P_n$ = P{n GPCs are in the system in operating condition,
in reserve or being used} .                            (II.3.3-3)

We then obtain the following formulas

$$ P_0 = \left[\sum_{k=0}^{M-1}\binom{N}{k}\rho^k
         + \sum_{k=M}^{N}\frac{N!\,\rho^k}{(N-k)!\,M!\,M^{k-M}}\right]^{-1} \tag{II.3.3-4} $$

where

$$ \rho = \frac{\lambda_R}{\lambda_c} \tag{II.3.3-5} $$

and

$$ P_k = \begin{cases}
   P_0 \dbinom{N}{k}\rho^k , & \text{if } k \le M-1 \\
   P_0\, \dfrac{N!\,\rho^k}{(N-k)!\,M!\,M^{k-M}} , & \text{if } M \le k \le N \\
   0 , & \text{otherwise}
   \end{cases} \tag{II.3.3-6} $$

We can now set the probability of system loss for this
model to be equal to the probability that the system contains no
more than M-1 GPCs in working condition. We then obtain $Q_{SL}$ to
be given by

$$ Q_{SL} = \sum_{k=0}^{M-1} P_k , \tag{II.3.3-7} $$

where $P_k$ is expressed by Eqs. (II.3.3-4)-(II.3.3-6).

In this manner, the system engineer can compute the probability 
of computer system loss under a simplex mode of operation. The 
proper system parameters (such as GPC failure rates, restoration 
rates, number of reserve GPCs) can then be adjusted or chosen. 
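
As an illustration of how such a loss probability can be evaluated numerically, the following is a minimal sketch that solves the birth-death balance equations for the number of working GPCs. It is not the report's closed form: it assumes that all failed GPCs are restored in parallel (each at rate `lam_r`), that only GPCs in service can fail, and that steady-state probabilities are wanted. The function names and the sample rates are hypothetical.

```python
def operating_state_probs(N, M, lam_c, lam_r):
    """Return [P_0, ..., P_N], where P_k = P{k GPCs are in working condition}."""
    weights = [1.0]  # unnormalised weight of state k = 0
    for k in range(1, N + 1):
        # ASSUMPTION: every failed GPC is restored in parallel at rate lam_r.
        restore_rate = (N - (k - 1)) * lam_r
        # ASSUMPTION: only the (at most M) GPCs in service can fail.
        fail_rate = min(k, M) * lam_c
        # Balance equation: P_k * fail_rate(k) = P_{k-1} * restore_rate(k-1).
        weights.append(weights[-1] * restore_rate / fail_rate)
    total = sum(weights)
    return [w / total for w in weights]


def simplex_loss_probability(N, M, lam_c, lam_r):
    """Sum of P_k for k = 0 .. M-1, i.e. at most M-1 GPCs are working."""
    return sum(operating_state_probs(N, M, lam_c, lam_r)[:M])


# Illustrative numbers only; they are not taken from the report.
print(simplex_loss_probability(N=5, M=1, lam_c=1e-3, lam_r=0.1))
```

When M = N every GPC behaves as an independent two-state unit, so the result can be checked against a binomial distribution with up-probability lam_r / (lam_c + lam_r).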
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I I. 4 Failure Analysis for the Computer System: The Redundant Mode 

We compute in this section the underlying reliability perform- 
ance measures for the computer system under the redundant mode . 

In this mode, four GPCs are operating in parallel conducting identical 
operations. The outputs of these GPCs are compared and voting is 
used to decide upon the correct output. In this manner, one and 
two GPC failures are readily detected and the failed computer is 
identified. When only two operating GPCs are left, by comparing 
outputs one can detect the failure of a third computer. It, however, 
remains to identify the third failing GPC. Self-testing procedures 
are subsequently used. When only one GPC is left, only self-testing 
techniques can be used to detect its failure. The underlying 
reliability characteristics are then identical with those computed 
for the simplex mode in Section II. 3. 

To understand the performance dependence upon the number of 
parallel GPCs in the redundant mode, we assume that there are N 
parallel GPCs. In the avionics system under consideration, a 
number of N=4 parallel GPCs are employed. Thus, we set 


N = number of parallel GPCs in redundant mode. (I I. 4-1) 


Each GPC, using self-testing procedures and programs, has a
coverage probability $P_d$. Thus, given that a GPC has failed, it will
detect its failure with probability $P_d$, employing self-test
techniques.

The GPC failure rate is equal to

$$\text{GPC failure rate} = \lambda_c . \tag{II.4-2}$$

We assume only primary failure here. Each GPC has a lifetime
(time to failure) described by a random variable $T_c$ (see Section II.2).
Note that

$$E(T_c) = \lambda_c^{-1} . \tag{II.4-3}$$

We initially assume that $T_c$ is exponentially distributed (Eq. (II.2-5)).
Typical values for the avionics system are:

Pj = 0.96 ; ■ 8 X 10'^ [failures/hour] . (II. 4-4) 

Our analysis is general, so that any proper parameter values can 
be incorporated. 

We first consider the probability measure $P_d(N)$. It has been
defined by Eq. (II.2-16) as the probability of detecting a GPC
failure, given it has occurred, when both self-test procedures and
comparison procedures among N GPCs are used. In the redundant mode,
we employ the comparison-voting procedure to detect and identify
failed computers. Therefore, if i GPCs are operating in parallel
with $i \ge 3$, we can always perfectly detect and identify any single
GPC failure, so that

$$P_d(i) = 1 , \quad \text{if } i = 3, 4, \ldots, N . \tag{II.4-5}$$

When only two GPCs are operating, we can still perfectly detect
whether one of the GPCs has failed, so that

$$P_d(2) = 1 . \tag{II.4-6}$$

In this case, however, we need to employ self-test techniques to
identify the failed computer.

When a single operating GPC is left, only self-test techniques
are used to identify its failures. Subsequently, we have

$$P_d(1) = P_d , \tag{II.4-7}$$

so that the failure detection probability is equal to the GPC
coverage.

To assess the reliability of the redundant computer complex,
we are interested in computing the following two measures:

$P_{SL}(T,N)$ = the probability of system loss during a T sec
        redundant operation, starting with N parallel GPCs;      (II.4-8)

$T_F(N)$ = mean time to system failure for a redundant computer
        system, starting with N parallel GPCs.                   (II.4-9)

The function $P_{SL}(T,N)$ is computed as follows. We set

$$f_{N-2}(t)\,dt = P\{(N-2)\text{nd GPC failure occurs in } (t, t+dt)\} . \tag{II.4-10}$$

Thus, $f_{N-2}(t)\,dt$ expresses the probability that, starting with N
parallel GPCs, we are left at time t with only two operating GPCs,
and the last failure occurred at time t, within (t-dt, t].

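If every computer has an exponentially distributed lifetime,
with mean $\lambda_c^{-1}$, and GPC lifetimes are statistically independent
(as well as identically distributed), we obtain the following result.

$$f_{N-2}(t)\,dt = P\{(N-3) \text{ failures in } (0,t)\}\, P\{\text{a failure in } (t, t+dt)\}$$
$$= \binom{N}{N-3} (1-e^{-\lambda_c t})^{N-3}\, e^{-3\lambda_c t}\, (3\lambda_c)\,dt . \tag{II.4-11}$$

Therefore,

$$f_{N-2}(t) = \tfrac{1}{2}\lambda_c \frac{N!}{(N-3)!} (1-e^{-\lambda_c t})^{N-3}\, e^{-3\lambda_c t} . \tag{II.4-12}$$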


We also note that the times between the first N-2 failures are
statistically described as follows. They are statistically independent
random variables such that the time between the i-th and (i+1)st GPC
failures is exponentially distributed with mean $[(N-i)\lambda_c]^{-1}$,
for i = 0, 1, ..., N-3.

We now observe the failure of the redundant computer system to
proceed in two phases. In the first phase, starting with N parallel
GPCs, N-2 GPCs fail. Using the comparison voting procedure, these
failures are immediately perfectly detected and identified. We set

$T_F(N,1)$ = time duration of first failure phase
           = time until the (N-2)nd GPC failure.                 (II.4-13)

Then, $T_F(N,1)$ is governed by the density (II.4-12). In particular,
the probability that phase one will be longer than t sec is given by

$$P\{T_F(N,1) > t\} = \int_t^{\infty} \tfrac{1}{2}\lambda_c \frac{N!}{(N-3)!} (1-e^{-\lambda_c x})^{N-3}\, e^{-3\lambda_c x}\,dx . \tag{II.4-14}$$

The mean duration of a phase one mode is given by

$$\bar{T}_F(N,1) = E[T_F(N,1)] = \lambda_c^{-1} \sum_{i=3}^{N} i^{-1} . \tag{II.4-15}$$

In particular, for N=4 we have:

$$f_2(t) = 12\lambda_c (1-e^{-\lambda_c t})\, e^{-3\lambda_c t} ; \tag{II.4-16}$$

$$\bar{T}_F(4,1) = \tfrac{7}{12}\lambda_c^{-1} . \tag{II.4-17}$$

Upon the termination of phase one, when we are left with only
two operating GPCs, the phase-two failure mode starts (provided at
this time, the computer system still operates in the redundant mode).
Having now two operating GPCs, we are interested in computing the
system loss probability $P_{SL}(t,2)$. This is the probability that,
starting with 2 GPCs, no operating GPC is left within t units of time.

To derive this function, we write:

$$1 - P_{SL}(t,2) = P\{\text{no failures in } (0,t)\} + P\{\text{a single GPC fails in } (0,t)\}\, P\{\text{detecting a failure} \mid \text{a GPC failure has occurred}\}$$
$$= e^{-2\lambda_c t} + 2P_d\, e^{-\lambda_c t}(1-e^{-\lambda_c t}) . \tag{II.4-18}$$

Therefore, we conclude that

$$P_{SL}(t,2) = 1 - 2P_d\, e^{-\lambda_c t} - e^{-2\lambda_c t}(1-2P_d) . \tag{II.4-19}$$


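We can now compute the system loss probability $P_{SL}(T,N)$ as

$$P_{SL}(T,N) = \int_0^T f_{N-2}(t)\, P_{SL}(T-t,2)\,dt . \tag{II.4-20}$$

Substituting (II.4-12) and (II.4-19) into (II.4-20) we obtain the
following result:

$$P_{SL}(T,N) = \int_0^T \tfrac{1}{2}\lambda_c \frac{N!}{(N-3)!} (1-e^{-\lambda_c t})^{N-3}\, e^{-3\lambda_c t}\, [1 - 2P_d\, e^{-\lambda_c (T-t)} - (1-2P_d)\, e^{-2\lambda_c (T-t)}]\,dt . \tag{II.4-21}$$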

In particular, for the present avionics system we set N=4
in (II.4-21) and obtain, after some algebra, the following expression
for the computer system loss probability.

$$P_{SL}(T) = 1 - e^{-2\lambda_c T}(3e^{-2\lambda_c T} - 8e^{-\lambda_c T} + 6) - 4P_d\, e^{-\lambda_c T}(1-e^{-\lambda_c T})^3$$
$$= (1-e^{-\lambda_c T})^3\, [1 + e^{-\lambda_c T}(3-4P_d)] . \tag{II.4-22}$$

Eq. (II.4-22) can also be derived simply as follows. We note that

$$P_{SL}(T) = P\{4 \text{ GPCs fail in } (0,T)\} + P\{3 \text{ GPCs fail in } (0,T)\}(1-P_d)$$
$$= (1-e^{-\lambda_c T})^4 + (1-P_d)\, 4e^{-\lambda_c T}(1-e^{-\lambda_c T})^3 . \tag{II.4-23}$$

Eqs. (II.4-22) and (II.4-23) are identical.







Extending the approach used to derive (II.4-23), we obtain the
loss probability $P_{SL}(T,N)$ when starting with N parallel GPCs, $N \ge 3$,
as follows.

$$P_{SL}(T,N) = P\{N \text{ GPCs fail in } (0,T)\} + P\{N-1 \text{ GPCs fail in } (0,T)\}(1-P_d)$$
$$= (1-e^{-\lambda_c T})^N + (1-P_d)\, N e^{-\lambda_c T}(1-e^{-\lambda_c T})^{N-1}$$
$$= (1-e^{-\lambda_c T})^{N-1}\, [1 + e^{-\lambda_c T}(N-1-NP_d)] . \tag{II.4-24}$$

Eqs. (II.4-22)-(II.4-24) can now be used to compute the loss
probability associated with the redundant computer complex. We
note the following characteristics of $P_{SL}(T)$. (Similar properties
hold for $P_{SL}(T,N)$, using (II.4-24).)
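
The agreement between the two-phase integral (II.4-21) and the closed form (II.4-24) can be checked numerically. The sketch below does this with a composite Simpson rule from the standard library only. The function names, the step count and the sample parameter values are illustrative choices, not part of the report.

```python
from math import exp, factorial


def phase_one_density(t, N, lam_c):
    """Density of the time of the (N-2)nd failure, Eq. (II.4-12)."""
    x = exp(-lam_c * t)
    return 0.5 * lam_c * factorial(N) / factorial(N - 3) * (1 - x) ** (N - 3) * x ** 3


def two_gpc_loss(t, lam_c, p_d):
    """Loss probability starting with two GPCs, Eq. (II.4-19)."""
    return 1 - 2 * p_d * exp(-lam_c * t) - (1 - 2 * p_d) * exp(-2 * lam_c * t)


def loss_by_integration(T, N, lam_c, p_d, steps=2000):
    """Eq. (II.4-20) by composite Simpson's rule (steps must be even)."""
    h = T / steps
    total = 0.0
    for j in range(steps + 1):
        t = j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * phase_one_density(t, N, lam_c) * two_gpc_loss(T - t, lam_c, p_d)
    return total * h / 3


def loss_closed_form(T, N, lam_c, p_d):
    """Closed form of Eq. (II.4-24)."""
    x = exp(-lam_c * T)
    return (1 - x) ** (N - 1) * (1 + x * (N - 1 - N * p_d))


# Illustrative values: time measured in units of 1/lam_c.
print(loss_by_integration(1.5, 4, 0.7, 0.96), loss_closed_form(1.5, 4, 0.7, 0.96))
```

The two printed values should agree to within the quadrature error, which gives a quick regression check when the formulas are transcribed into other tools.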

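The loss probability $P_{SL}(T)$, given by (II.4-22), is a linearly
decreasing function of the coverage $P_d$. This is illustrated by
Fig. II.4.1.

[Figure II.4.1: $P_{SL}(T)$ plotted against the coverage $P_d$, decreasing
linearly from $P_{SL}(T,P_d=0)$ through $P_{SL}(T,P_d=\tfrac12)$ to $P_{SL}(T,P_d=1)$.]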

If $P_d = 1$, so that we can detect with probability one a computer
failure when it has occurred, we obtain by (II.4-22) the loss
probability to be equal to

$$P_{SL}(T,P_d=1) = (1-e^{-\lambda_c T})^4 . \tag{II.4-25}$$

This is the lowest attainable value for the loss probability.

The highest loss probability is observed when $P_d = 0$. Then,
the self-test techniques are inoperable (or useless), and we have

$$P_{SL}(T,P_d=0) = (1-e^{-\lambda_c T})^3 (1+3e^{-\lambda_c T}) . \tag{II.4-26}$$

We note that Eq. (II.4-26) incorporates the observation that if
$P_d=0$ and two GPCs are left, any GPC failure will result in a system
loss condition. Hence, 3 or 4 GPC failures will result in system
loss. In turn, if $P_d = 1$, when two GPCs are left, the system remains
operational under a single GPC failure, and is lost only when both
GPCs fail. Hence, system loss now occurs only if all 4 GPCs fail,
yielding expression (II.4-25).

For $P_d = \tfrac12$, we obtain

$$P_{SL}(T,P_d=\tfrac12) = (1-e^{-\lambda_c T})^3 (1+e^{-\lambda_c T}) . \tag{II.4-27}$$

For $P_d = \tfrac12$, we also note that (see (II.4-19))

$$P_{SL}(t,2,P_d=\tfrac12) = 1 - e^{-\lambda_c t} . \tag{II.4-28}$$

Consider now the following procedure, to be called the random
choice procedure. When two operational GPCs are left, if a failure
is observed (through the comparison procedure), one GPC is arbitrarily
(at random) shut down. Or, alternatively, when 2 GPCs are left, one
GPC is arbitrarily shut down. Under this procedure, the system
loss probability, starting with 2 GPCs, $\tilde{P}_{SL}(t,2)$, is obtained to be
given by

$$\tilde{P}_{SL}(t,2) = 1 - e^{-\lambda_c t} = P_{SL}(t,2,P_d=\tfrac12) . \tag{II.4-29}$$

The associated system loss probability under a random choice procedure,
$\tilde{P}_{SL}(T,N)$, is thus equal to that obtained when $P_d = \tfrac12$:

$$\tilde{P}_{SL}(T,N) = P_{SL}(T,N,P_d=\tfrac12) . \tag{II.4-30}$$

Therefore,

$$\tilde{P}_{SL}(T,N) < P_{SL}(T,N,P_d) \quad \text{for } P_d < \tfrac12 , \tag{II.4-31a}$$

$$\tilde{P}_{SL}(T,N) > P_{SL}(T,N,P_d) \quad \text{for } P_d > \tfrac12 . \tag{II.4-31b}$$

Thus, if $P_d < \tfrac12$ the random choice policy is preferable. Self-test
techniques should not then be utilized, since they provide misleading
failure information. On the other hand, if $P_d > \tfrac12$, as is
the case in the avionic system under consideration, a lower loss
probability is attained when self-test techniques are utilized
(since they then clearly provide additional helpful failure information).
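
The comparison between the two policies can be made concrete with a short sketch that evaluates the closed form (II.4-24) for a given coverage and for the random choice policy, which by (II.4-30) behaves like coverage one half. Function names and sample values are hypothetical.

```python
from math import exp


def redundant_loss(T, N, lam_c, p_d):
    """Loss probability with self-test coverage p_d, Eq. (II.4-24)."""
    x = exp(-lam_c * T)
    return (1 - x) ** (N - 1) * (1 + x * (N - 1 - N * p_d))


def random_choice_loss(T, N, lam_c):
    """Loss probability under the random choice policy, Eq. (II.4-30)."""
    return redundant_loss(T, N, lam_c, 0.5)


def preferred_policy(T, N, lam_c, p_d):
    """Return the policy with the lower loss probability for these parameters."""
    if redundant_loss(T, N, lam_c, p_d) < random_choice_loss(T, N, lam_c):
        return "self-test"
    return "random choice"


# Illustrative values only: time in units of 1/lam_c.
print(preferred_policy(T=1.0, N=4, lam_c=1.0, p_d=0.96))
print(preferred_policy(T=1.0, N=4, lam_c=1.0, p_d=0.30))
```

Because (II.4-24) is linear and decreasing in the coverage, the switch-over between the two policies occurs at a coverage of one half regardless of T, N and the failure rate.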

We now compute the mean duration of the phase-two failure
period, denoted by $\bar{T}_F(N,2)$. Phase two starts with two operational
GPCs. Let $T_1, T_2$ denote the lifetimes of these GPCs. These are
i.i.d. exponentially distributed random variables with means $\lambda_c^{-1}$.
The first GPC failure occurs at time $\min(T_1,T_2)$. Then, with probability
$1-P_d$ the failure is not detected and the system is lost. With probability
$P_d$, the failure is detected and the remaining operational GPC
continues to operate until it fails. Following these observations,
the following result is obtained.

$$\bar{T}_F(N,2) = E\{\min(T_1,T_2)\} + P_d \lambda_c^{-1} ,$$

where

$$E\{\min(T_1,T_2)\} = \tfrac{1}{2}\lambda_c^{-1} .$$

Subsequently,

$$\bar{T}_F(N,2) = (\tfrac{1}{2} + P_d)\, \lambda_c^{-1} . \tag{II.4-32}$$

We again note that under the random choice policy the mean
lifetime duration of phase two, $E[\tilde{T}_F(N,2)]$, is given by

$$E[\tilde{T}_F(N,2)] = \lambda_c^{-1} = \bar{T}_F(N,2,P_d=\tfrac12) . \tag{II.4-33}$$

The overall mean lifetime $T_F(N)$ is obtained by using
Eqs. (II.4-15), (II.4-32), (II.4-33), to be given by

$$T_F(N) = \lambda_c^{-1} \Big( \sum_{i=3}^{N} i^{-1} + \tfrac{1}{2} + P_d \Big) , \tag{II.4-34}$$

$$E[\tilde{T}_F(N)] = \lambda_c^{-1} \Big( \sum_{i=3}^{N} i^{-1} + 1 \Big) = T_F(N,P_d=\tfrac12) . \tag{II.4-35}$$

In particular, for N=4 the mean lifetimes are obtained by
(II.4-34)-(II.4-35) to be equal to

$$T_F(4) = \lambda_c^{-1} \big( P_d + \tfrac{13}{12} \big) , \tag{II.4-36}$$

$$E[\tilde{T}_F(4)] = \tfrac{19}{12}\lambda_c^{-1} = 1.583\,\lambda_c^{-1} = T_F(4,P_d=\tfrac12) . \tag{II.4-37}$$

For $P_d = 1$, we obtain

$$T_F(4,P_d=1) = \tfrac{25}{12}\lambda_c^{-1} = 2.083\,\lambda_c^{-1} . \tag{II.4-38}$$

The functional dependence of the mean lifetime $T_F(4)$ on the
coverage probability $P_d$, indicated by Eq. (II.4-36), is illustrated
in Fig. II.4.2.

[Figure II.4.2: Mean Computer System Lifetime (N=4) versus Coverage ($P_d$).]
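
The mean-lifetime expressions above lend themselves to exact rational arithmetic. The sketch below evaluates the mean lifetime in units of the mean GPC lifetime, as the sum of the phase-one harmonic terms and the phase-two term; the function name is a hypothetical choice, and the coverage is passed as a `Fraction`, integer or decimal string so that no rounding occurs.

```python
from fractions import Fraction


def mean_lifetime(N, p_d):
    """Mean lifetime T_F(N) in units of 1/lambda_c (phase one plus phase two)."""
    phase_one = sum(Fraction(1, i) for i in range(3, N + 1))  # Eq. (II.4-15)
    phase_two = Fraction(1, 2) + Fraction(p_d)                # Eq. (II.4-32)
    return phase_one + phase_two


print(mean_lifetime(4, 1))                # full coverage
print(mean_lifetime(4, Fraction(1, 2)))   # equivalent to the random choice policy
print(mean_lifetime(4, "0.96"))           # coverage quoted for the avionics system
```

Keeping the result as a fraction makes it easy to compare against the tabulated constants before converting to hours with a specific failure rate.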

We now examine the dependence of the computer system
reliability measures on the number N of parallel computers.

The system loss probability $P_{SL}(T,N)$ is given by formula (II.4-24).
If $P_d=1$, we have

$$P_{SL}(T,N,P_d=1) = (1-e^{-\lambda_c T})^N . \tag{II.4-39}$$

Therefore, for $P_d=1$,

$$\frac{P_{SL}(T,N+1,P_d=1)}{P_{SL}(T,N,P_d=1)} = 1-e^{-\lambda_c T} , \tag{II.4-40}$$

so that by using an additional parallel GPC we decrease the loss
probability by a factor of $(1-e^{-\lambda_c T})^{-1}$.

The mean lifetime $T_F(N)$ when N parallel GPCs are used and
$P_d=1$ is given by

$$T_F(N,P_d=1) = \lambda_c^{-1} \Big( \tfrac{3}{2} + \sum_{i=3}^{N} i^{-1} \Big) . \tag{II.4-41}$$

Therefore,

$$\frac{T_F(N+1,P_d=1)}{T_F(N,P_d=1)} = \frac{\tfrac{3}{2} + \sum_{i=3}^{N+1} i^{-1}}{\tfrac{3}{2} + \sum_{i=3}^{N} i^{-1}} . \tag{II.4-42}$$

Eq. (II.4-42) represents the factor by which the mean lifetime to
failure of the redundant computer system is increased, when the
number of parallel GPCs is increased from N to N+1.

For example, if we use only N=3 parallel GPCs, rather than
N=4 parallel GPCs, we obtain

$$\frac{T_F(3,P_d=1)}{T_F(4,P_d=1)} = \frac{\tfrac{3}{2} + \tfrac{1}{3}}{\tfrac{3}{2} + \tfrac{1}{3} + \tfrac{1}{4}} = \frac{22}{25} = 0.88 . \tag{II.4-43}$$

Thus, using 3 parallel GPCs, rather than 4, reduces the mean lifetime
by 12%.

If we, on the other hand, employ 5 parallel GPCs, rather than 4,
we obtain

$$\frac{T_F(5,P_d=1)}{T_F(4,P_d=1)} = \frac{\tfrac{3}{2} + \tfrac{1}{3} + \tfrac{1}{4} + \tfrac{1}{5}}{\tfrac{3}{2} + \tfrac{1}{3} + \tfrac{1}{4}} = \frac{137}{125} = 1.096 ; \tag{II.4-44}$$

so that the mean lifetime is then increased by about 9.6%.
The system loss probability during an operational period of duration
T is then reduced, according to (II.4-40), by a factor of
$(1-e^{-\lambda_c T})^{-1}$.
The absolute value of the system loss probability is given by (II.4-24).
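
The trade-off between adding a GPC and the resulting gains can be tabulated directly from (II.4-40) and (II.4-42). The following sketch does so for full coverage; the function names and the range of N are illustrative assumptions.

```python
from fractions import Fraction
from math import exp


def mean_lifetime_full_coverage(N):
    """T_F(N, P_d = 1) in units of 1/lambda_c, Eq. (II.4-41)."""
    return Fraction(3, 2) + sum(Fraction(1, i) for i in range(3, N + 1))


def lifetime_ratio(N):
    """T_F(N+1, P_d=1) / T_F(N, P_d=1), Eq. (II.4-42)."""
    return mean_lifetime_full_coverage(N + 1) / mean_lifetime_full_coverage(N)


def loss_ratio(T, lam_c):
    """P_SL(T, N+1, P_d=1) / P_SL(T, N, P_d=1), Eq. (II.4-40); does not depend on N."""
    return 1 - exp(-lam_c * T)


for N in range(3, 7):
    ratio = lifetime_ratio(N)
    print(f"N={N} -> N={N + 1}: lifetime ratio {ratio} = {float(ratio):.3f}")
```

The lifetime gain per added GPC shrinks as N grows, while the loss-probability ratio stays fixed for a given mission duration, which is the trade-off the text describes.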

The equations derived above for the computer system loss
probability and lifetime are in terms of the following parameters:
$\lambda_c$ = the GPC failure rate; T = duration of the redundant phase;
N = number of parallel GPCs; $P_d$ = coverage probability. Eq.
(II.4-24) yields the loss probability and Eq. (II.4-34) the mean
lifetime. The system designer and analyst can use these results to
study or adjust the failure characteristics of the redundant
computer system.
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I I. 5 Failure Analysis for an Application Subsystem 

We consider an application subsystem of the Space Shuttle 
avionics data processing network. The failure characteristics 
of this subsystem are examined in this section. 

The subsystem under consideration can be a telemetry sub- 
system supplying information data to the computer network at certain 
times; a sensor subsystem; actuator subsystem receiving commands 
from the computer complex; display subsystem; control subsystem; 
interface subsystem; GNC subsystem or the mass memory subsystem. 

An application subsystem is often internally redundant.
This is the case for the hand controllers and the keyboard units.
Also, all safety-of-flight critical effector subsystems, such as 
the actuators for the main engine and for the aerosurfaces , the 
main engine interface units and mission event controllers are 
internally redundant at different levels. Such subsystems receive 
redundant commands on separate input channels and using internal 
algorithms they generate a single output stream. These algorithms 
also detect incorrect commands and eliminate such commands from 
consideration in the output. 

Subsystems which perform similar functions are assigned to
the same data-bus group. Subsystems have different levels of
redundancy at the unit level, in accordance with their criticality.
For example, there are three inertial measurement units, two
radar altimeters, and four air data transducer assemblies. To
prevent the loss of more than one redundant unit when one data
bus fails, no two redundant units interface with the same bus.
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To analyze the failure characteristics (invulnerability) of a 
redundant subsystem, we set the following parameters. The subsystem 
under consideration is assumed to contain L equivalent redundant 
units. Each unit is assumed to be connected to a different bus.
Thus,

      L = number of redundant units in the subsystem
        = number of data buses connected to the subsystem.       (II.5-1)

We characterize the failure properties of each unit by the
unit failure rate $\lambda_u$,

$$\lambda_u = \text{unit failure rate [failures/sec]} . \tag{II.5-2}$$

Thus, if $T_u$ is a random variable representing the unit lifetime
(i.e., time duration to failure), we have

$$\text{Average time to unit failure} = E(T_u) = \lambda_u^{-1} . \tag{II.5-3}$$

The unit lifetime distribution is specified as

$$F_u(x) = P(T_u \le x) , \quad x \ge 0 . \tag{II.5-4}$$

If the unit lifetime is assumed to be exponentially distributed, we
have

$$F_u(x) = 1 - e^{-\lambda_u x} , \quad x \ge 0 . \tag{II.5-5}$$

We assume unit lifetimes to be statistically independent and 
identically distributed. Furthermore, to explicitly illustrate the 
subsystem failure behavior, we assume now an exponential failure 
distribution (II. 5-5). (The following results, however, are readily 
extended to include an arbitrary unit lifetime distribution.) 



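We consider an operational period which lasts for T [sec]. We set

$q_u(T)$ = probability that a unit fails in T units of time

$$= 1 - e^{-\lambda_u T} , \tag{II.5-6}$$

$Q_u(T)$ = probability that all subsystem units fail in T units of time

$$= [q_u(T)]^L = (1 - e^{-\lambda_u T})^L . \tag{II.5-7}$$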

Each unit is assumed to be connected to a different data bus.
To evaluate the probability of operational loss (or survival) for
the subsystem, we now specify the failure characteristics of the
data buses.

Each data bus is associated with a random variable $T_\ell$ representing
its lifetime (i.e., time to failure). Line failures can be defined
to include both physical failures and interference (noise)
phenomena which cause degradation in data communications across
the line. We then set the line failure rate to be

$$\lambda_\ell = \text{data bus (line) failure rate} . \tag{II.5-8}$$

The distribution of the line (data bus) lifetime is given by

$$F_\ell(x) = P(T_\ell \le x) , \quad x \ge 0 . \tag{II.5-9}$$

Note that

$$\text{Data bus mean time to failure} = E(T_\ell) = \lambda_\ell^{-1} . \tag{II.5-10}$$

Assuming the data bus lifetime (time to "failure") to be exponentially
distributed, we have

$$F_\ell(x) = 1 - e^{-\lambda_\ell x} , \quad x \ge 0 . \tag{II.5-11}$$
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The invulnerability of the subsystem is expressed in terms of 
the following two measures. The subsystem loss probability is 
defined by 

$q_{SL}(T)$ = probability of subsystem loss within T units of time
        = probability that within T units of time the subsystem
          fails or is disconnected from the bus network.         (II.5-12)

The subsystem mean lifetime is defined as

$T_{SF}$ = the subsystem mean time to failure or disconnection
        from the bus network.                                    (II.5-13)

The subsystem loss probability $q_{SL}(T)$ is computed as follows.

$$q_{SL}(T) = \prod_{i=1}^{L} P\{\text{unit } i \text{ is lost or disconnected}\}$$
$$= \prod_{i=1}^{L} P\{\text{unit } i \text{ fails or its data bus fails}\}$$
$$= \prod_{i=1}^{L} [1 - P(\text{unit } i \text{ does not fail, its data bus does not fail})]$$
$$= \prod_{i=1}^{L} [1 - P(\text{unit } i \text{ does not fail})\, P(\text{data bus connected to unit } i \text{ does not fail})] . \tag{II.5-14}$$

Therefore, the subsystem loss probability is given by the formula

$$q_{SL}(T) = [1 - e^{-(\lambda_u + \lambda_\ell) T}]^L . \tag{II.5-15}$$

The subsystem mean lifetime (time to failure) $T_{SF}$ is similarly
derived to be given by

$$T_{SF} = (\lambda_u + \lambda_\ell)^{-1} \sum_{i=1}^{L} i^{-1} . \tag{II.5-16}$$

To derive equation (II.5-16), one notes that if i operating units
are left, the time to the next failure (of a unit or its associated
data bus) is exponentially distributed with mean $[i(\lambda_u + \lambda_\ell)]^{-1}$ [sec].

Eqs. (II.5-15)-(II.5-16) provide the desired formulas for
establishing the failure characteristics of the redundant subsystem.
The parameters involved are: the operation period duration (T);
the number of redundant units and data buses (L); the failure rate of
a unit ($\lambda_u$); and the failure rate of the data bus ($\lambda_\ell$). In terms
of these parameters, Eq. (II.5-15) yields the probability of
subsystem loss (so that no connected operating unit is left),
while Eq. (II.5-16) expresses the mean time to subsystem loss.

For given subsystem parameters, these formulas are used to
compute the subsystem invulnerability. For a specified subsystem
loss probability (or mean lifetime), one uses these results to
calculate the required level of subsystem redundancy and the underlying
unit and data bus failure rates.
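
Both uses described above, evaluating a given design and sizing the redundancy for a target, can be sketched in a few lines. The functions below follow (II.5-15) and (II.5-16); the names, the search bound `max_units` and the sample rates are hypothetical.

```python
from math import exp


def subsystem_loss(T, L, lam_u, lam_l):
    """Subsystem loss probability, Eq. (II.5-15)."""
    return (1 - exp(-(lam_u + lam_l) * T)) ** L


def subsystem_mean_lifetime(L, lam_u, lam_l):
    """Subsystem mean time to loss, Eq. (II.5-16)."""
    return sum(1 / i for i in range(1, L + 1)) / (lam_u + lam_l)


def units_needed(T, lam_u, lam_l, target, max_units=64):
    """Smallest L whose loss probability does not exceed target (None if not found)."""
    for L in range(1, max_units + 1):
        if subsystem_loss(T, L, lam_u, lam_l) <= target:
            return L
    return None


# Illustrative rates only, in failures per unit time.
print(subsystem_loss(T=1.0, L=3, lam_u=0.5, lam_l=0.5))
print(units_needed(T=1.0, lam_u=0.5, lam_l=0.5, target=0.2))
```

A linear search is adequate here because the loss probability decreases monotonically in L and practical redundancy levels are small.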




II. 6 FAILURE ANALYSIS FOR THE DATA PROCESSING NETWORK 

II. 6.1 Reliability Performance Measures for the Data Processing 
Network 

The Space Shuttle orbiter avionics data processing network
consists of serial data buses which connect the application
subsystems to the computer complex. The data buses are divided into
groups. Different groups provide communication connections to
different subsystems. Certain subsystems contain redundant units,
each connected to a different data bus, to increase the subsystem
invulnerability to failure.

Reliability measures for the computer system have been
presented in Section II.2. The associated failure analysis for
the computer system is carried out in Sections II.3-II.4. Failure
analysis for an application subsystem is presented in Section II.5.
In this section we wish to combine these results with the failure
characteristics of the data communication network.

The topological structure of the data bus network is specified
by the incidence matrix B, where

$$B = [b_{ij}]$$

and

$$b_{ij} = \begin{cases} 1, & \text{if data bus } j \text{ connects unit } i \text{ to the computer complex} \\ 0, & \text{otherwise.} \end{cases} \tag{II.6.1-1}$$


Each subsystem contains a number of units. We can thus
describe the topological interconnections between the subsystems
and the computer complex by a subsystem incidence matrix A, where

$$A = [a_{ij}]$$

and

$$a_{ij} = \begin{cases} 1, & \text{if subsystem } i \text{ is connected to data bus } j \\ 0, & \text{otherwise.} \end{cases} \tag{II.6.1-2}$$

The overall network topological structure is specified by the
connectivity matrix, also called adjacency matrix, C, where

$$C = [c_{ij}]$$

and

$$c_{ij} = \begin{cases} 1, & \text{if node } i \text{ is connected to node } j \\ 0, & \text{otherwise.} \end{cases} \tag{II.6.1-3}$$

We regard each network element (GPC, application subsystem or 

unit) as a node. Nodes are connected by the data bus lines, 

inducing thus an underlying topological structure modelled as a graph. 

When the computer system is in the redundant mode, four GPCs
are connected in parallel, having simultaneous access to all
application subsystems. We then have

$$c_{ij} = 1$$

for each unit i and GPC j.

When the computer system is in simplex mode, each subsystem
(task) is associated, on a dedicated basis, with a certain computer.
Then,

$$a_{ij} = 1$$

whenever subsystem i is associated with GPC j, and $a_{ij} = 0$ otherwise.

We wish to examine the invulnerability of the data processing
network to failures of nodes and lines. To assess network reliability,
the following performance measures are of interest.
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We incorporate, as element failures, the failures of computers, 
data bus lines and subsystem units. 

In the redundant mode of operation, we say that a network loss 
event has occurred whenever a certain set of tasks cannot be 
processed by the computer complex. This can be due to computer 
failures, line failures (or noise), or failures of units in certain 
application subsystems. 

The probability of network loss in T units of time is set to
be

$P_{NL}(T)$ = probability of network loss in T units of time.    (II.6.1-4)

To define and compute $P_{NL}(T)$ we identify a set of critical subsystems
(or tasks), the failure of each of which induces a system loss event.
We thus set

$N_c$ = set of critical subsystems in the redundant mode.         (II.6.1-5)

Subsequently, the network loss probability in the redundant mode is
defined as

$P_{NL}(T)$ = probability that, under the redundant mode, a critical
        subsystem cannot be utilized, or connected to the
        computer complex, or receive information-processing
        service from the computer system.                        (II.6.1-6)

Clearly, in computing $P_{NL}(T)$ we need to consider the availability
of computer processing resources to serve the critical subsystem,
the reliable transmission of information between the computer complex
and the critical subsystems, and the operational integrity of the
critical subsystems themselves. We also incorporate the possibility
of rerouting upon certain line failures.
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In a similar manner, we define the mean time to network loss as 


$T_{NL}$ = mean time to network loss, under redundant mode
       = mean time until the failure of a critical system, or    (II.6.1-7)
         its network disconnection, or the non-availability of
         computer resources for its associated processing services.


Under a simplex mode of operation, we consider the subnetwork
composed of a single GPC and its associated application subsystems.
The probability of network loss is then similarly defined as

$q_{NL}(T)$ = probability that a critical subsystem cannot be
        connected to a GPC in T units of time, under the
        simplex mode.                                            (II.6.1-8)

In computing $q_{NL}(T)$ we consider GPC failures, line failures and
unit failures, as before. In addition, we also incorporate the
possibilities of rerouting messages (through alternate paths, when their
primary paths fail). Also, we consider the utilization of a stand-by
GPC to replace a failed computer.

In a similar manner, the mean time to network loss under
simplex mode is defined by

network loss, under simplex mode. (II. 6. 1-9) 

In assessing the interconnecting communication data bus network
itself, the following connectivity measures are useful:

K(i) = minimal number of line failures which cause subsystem
       i to be disconnected.                                     (II.6.1-10)

$P_K(i)$ = probability that subsystem i is disconnected.          (II.6.1-11)

For time-critical tasks, it is also of interest to define the
delay dependent reliability measure

$P_K(i,D)$ = probability that a task associated with subsystem i
        cannot be processed by a GPC within D units of time.     (II.6.1-12)

In computing (II.6.1-12), we note that it is possible that the
subsystem will remain connected to the computer complex, after certain
failures, but due to increased traffic (caused, for example, by
rerouting tasks away from failed lines or GPCs), associated tasks
cannot receive service (processing) within their required critical
time delay constraints.

II.6.2 Failure Analysis for the Data Processing Network: The
Redundant Mode

The computer system is assumed to be in the redundant mode. The
computer failure rate is $\lambda_c$ [failures/sec]. The computer coverage
probability (i.e., the probability that a GPC will detect its
failure, when it has failed, using self-test procedures) is equal
to $P_d$. Then, if N GPCs operate in parallel, the probability of a
computer system loss in T sec, $P_{SL}(T,N)$, is given by Eq. (II.4-24)
as

$$P_{SL}(T,N) = (1-e^{-\lambda_c T})^{N-1}\, [1 + e^{-\lambda_c T}(N-1-NP_d)] . \tag{II.6.2-1}$$

The mean time to failure for the computer system is given by (II.4-34)
to be equal to

$$T_F(N) = \lambda_c^{-1} \Big( \sum_{i=3}^{N} i^{-1} + \tfrac{1}{2} + P_d \Big) . \tag{II.6.2-2}$$

In particular, when N=4, we obtain

$$P_{SL}(T) = P_{SL}(T,4) = (1-e^{-\lambda_c T})^3\, [1 + e^{-\lambda_c T}(3-4P_d)] ; \tag{II.6.2-3}$$

$$T_F = T_F(4) = \lambda_c^{-1} \big( P_d + \tfrac{13}{12} \big) . \tag{II.6.2-4}$$

Considering now an application subsystem, its failure analysis
has been presented in Section II.5. Assume subsystem i to contain
$L_i$ redundant units. Assume each unit to be connected to a single
data bus, which is in turn connected to the GPC complex (and
thus to all GPCs in the redundant mode). The failure rate of a
unit which belongs to subsystem i is set equal to $\lambda_u^{(i)}$ [failures/sec].

The data bus line failure rate is equal to $\lambda_\ell$ [failures/sec],
for each line. Line failures are assumed to be statistically
independent. Time to failure of a data bus line is taken to be
governed by an exponential distribution with mean $\lambda_\ell^{-1}$. Then, by Eq.
(II.5-15) we find that the probability of subsystem i loss, denoted
as $q_{SL}^{(i)}(T)$, indicating the probability that subsystem i will fail
or become disconnected within T sec, is given by

$$q_{SL}^{(i)}(T) = [1 - e^{-(\lambda_u^{(i)} + \lambda_\ell) T}]^{L_i} . \tag{II.6.2-5}$$

The mean time to failure of subsystem i is given, according to Eq.
(II.5-16), by

$$T_{SF}^{(i)} = (\lambda_u^{(i)} + \lambda_\ell)^{-1} \sum_{k=1}^{L_i} k^{-1} . \tag{II.6.2-6}$$

Subsystem i is said to be in a state of network loss if it has
failed, is disconnected from the data bus network or if the computer
system is lost. We set

$P_{NL}^{(i)}(T)$ = probability that subsystem i is in a state of
        network loss.                                            (II.6.2-7)

Then, combining results (II.6.2-1) and (II.6.2-5), we obtain

$$P_{NL}^{(i)}(T) = 1 - [1 - q_{SL}^{(i)}(T)]\,[1 - P_{SL}(T,N)]$$
$$= 1 - \{1 - [1 - e^{-(\lambda_u^{(i)} + \lambda_\ell) T}]^{L_i}\}\, \{1 - (1-e^{-\lambda_c T})^{N-1}\, [1 + e^{-\lambda_c T}(N-1-NP_d)]\} . \tag{II.6.2-8}$$
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If we now let 

$$N_c = \{i_1, i_2, \ldots, i_c\} , \tag{II.6.2-9}$$

so that subsystems $i_1, i_2, \ldots, i_c$ are regarded as the critical
subsystems, then the network loss probability $P_{NL}(T)$ is given by

$$1 - P_{NL}(T) = \prod_{k=1}^{c} [1 - q_{SL}^{(i_k)}(T)] \cdot \{1 - (1-e^{-\lambda_c T})^{N-1}\, [1 + e^{-\lambda_c T}(N-1-NP_d)]\} . \tag{II.6.2-10}$$

Eq. (II.6.2-10) expresses the probability of network survival, $1-P_{NL}(T)$,
as the product of the survival probabilities of the critical subsystems
and the computer system.

The mean time to network loss is the mean time to the first failure
of the computer system or of any one of the critical subsystems, or its
disconnection.

Eq. (II.6.2-10) can be used to evaluate the invulnerability of
the data processing network to failures of the computer system, data
bus lines and application subsystem units.
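
A minimal sketch of this evaluation follows: the network survival probability is formed as the product of the computer-system survival probability and the survival probabilities of the critical subsystems. The function names and the representation of the critical set as a list of (units, unit failure rate) pairs are assumptions made for illustration, as are the sample numbers.

```python
from math import exp


def computer_loss(T, N, lam_c, p_d):
    """Redundant computer system loss probability, Eq. (II.4-24)."""
    x = exp(-lam_c * T)
    return (1 - x) ** (N - 1) * (1 + x * (N - 1 - N * p_d))


def subsystem_loss(T, L, lam_u, lam_l):
    """Loss probability of one subsystem with L redundant units, Eq. (II.5-15)."""
    return (1 - exp(-(lam_u + lam_l) * T)) ** L


def network_loss(T, N, lam_c, p_d, lam_l, critical):
    """Network loss probability as one minus the product of survival probabilities.

    critical: list of (L_i, lam_u_i) pairs, one per critical subsystem (hypothetical layout).
    """
    survival = 1 - computer_loss(T, N, lam_c, p_d)
    for L, lam_u in critical:
        survival *= 1 - subsystem_loss(T, L, lam_u, lam_l)
    return 1 - survival


# Illustrative values only.
critical_set = [(3, 0.02), (2, 0.01), (4, 0.05)]
print(network_loss(T=1.0, N=4, lam_c=0.1, p_d=0.96, lam_l=0.05, critical=critical_set))
```

Because the result is a product of survival terms, the network loss probability can never be smaller than the loss probability of its weakest element, which is a useful sanity check on inputs.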

II.6.3 Network Invulnerability: Alternate Routing and Congestion
Effects

The network invulnerability characteristics can be improved by
providing alternate routes upon data bus failures. This is demonstrated
as follows.

Assume a subsystem with L redundant units. The unit failure
rate is $\lambda_u$ [failures/sec]. The line failure rate is $\lambda_\ell$ [failures/sec].
The subsystem is associated with K data buses. A switching capability
is provided so that, upon the failure of its line, a unit can be
connected to one of the available operational buses associated with the
subsystem. Thus, initially each one of the L units is connected to a
data bus. When its line fails, a unit can be connected to one of
the operational associated lines (including a line that was
previously connected to another unit which has failed).

Under such a switching procedure, the probability of subsystem
loss, denoted as $\hat{q}_{SL}(T)$, is computed as follows.

$$\hat{q}_{SL}(T) = P\{L \text{ units fail or } K \text{ lines fail, or both}\}$$
$$= P\{L \text{ units fail}\} + P\{K \text{ lines fail}\} - P\{L \text{ units fail}\}\, P\{K \text{ lines fail}\}$$
$$= 1 - [1 - (1-e^{-\lambda_u T})^L]\, [1 - (1-e^{-\lambda_\ell T})^K] . \tag{II.6.3-1}$$

We note that for $K \ge L$,

$$\hat{q}_{SL}(T) \le q_{SL}(T) , \tag{II.6.3-2}$$

where $q_{SL}(T)$ is the loss probability (II.5-15) without switching.
Thus, by providing K alternate data buses, we have decreased the
subsystem loss probability.
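
The benefit of the switching capability can be illustrated by evaluating the dedicated-bus loss probability (II.5-15) next to the switched-bus expression (II.6.3-1). The sketch below uses hypothetical function names and sample rates.

```python
from math import exp


def loss_dedicated(T, L, lam_u, lam_l):
    """Each unit tied to its own bus, Eq. (II.5-15)."""
    return (1 - exp(-(lam_u + lam_l) * T)) ** L


def loss_switched(T, L, K, lam_u, lam_l):
    """L units sharing K switchable buses, Eq. (II.6.3-1)."""
    all_units_fail = (1 - exp(-lam_u * T)) ** L
    all_lines_fail = (1 - exp(-lam_l * T)) ** K
    return 1 - (1 - all_units_fail) * (1 - all_lines_fail)


# Illustrative values only.
T, lam_u, lam_l = 2.0, 0.3, 0.2
for L in (1, 2, 3, 4):
    print(L, loss_dedicated(T, L, lam_u, lam_l), loss_switched(T, L, L, lam_u, lam_l))
```

For a single unit and a single bus the two expressions coincide; the gap opens as soon as there is more than one unit, since a surviving unit is no longer lost together with its own bus.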

Such alternate data buses can be provided to the critical subsystems.
Providing $K_{i_k}$ alternate routes to critical subsystem $i_k$, we subsequently
obtain the network loss probability to be given by (when all routes
are assumed to be distinct):

$$1 - P_{NL}(T) = \prod_{k=1}^{c} [1 - \hat{q}_{SL}^{(i_k)}(T)] \cdot \{1 - (1-e^{-\lambda_c T})^{N-1}\, [1 + e^{-\lambda_c T}(N-1-NP_d)]\} , \tag{II.6.3-3}$$

where

$$1 - \hat{q}_{SL}^{(i)}(T) = [1 - (1-e^{-\lambda_u^{(i)} T})^{L_i}]\, [1 - (1-e^{-\lambda_\ell T})^{K_i}] . \tag{II.6.3-4}$$

Eq. (II.6.3-3) expresses the probability $1-P_{NL}(T)$ of network survival
as the product of the survival probabilities of the computer complex,
critical subsystems and the alternate routes.
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In turn, as buses are switched to serve critical tasks, non- 
critical tasks are delayed. If, however, the number of remaining 
operational data buses is below a certain critical value $m_0$, the
overall traffic associated with critical tasks is high enough to
cause an excessively high message delay value $D_0$. Under such high
message delays, the network cannot provide satisfactory service
to the critical tasks, and the network can be said to be lost. This
loss probability is thus defined as

$P_{NL}(T)$ = probability that the computer system is lost, or
        a critical subsystem is lost, or that the
        communication network can provide no more than
        $m_0$ interconnecting data buses, causing a critical
        message delay value higher than $D_0$.                    (II.6.3-5)


To compute $\hat{P}_{NL}(T)$, we model the whole communication network 
topological structure. We assume that the c critical subsystems 
can share a common pool of m data buses, $m \ge c$. Thus, upon the 
failure of a line, an operational line from the pool of these m lines 
can be rerouted to serve the associated critical subsystem. The 
subsystems will be disconnected from the computer complex if $m-c+1$ 
or more data buses fail. Therefore, we obtain 


Probability of disconnection of critical subsystems from 
the computer complex in T units of time 

$$= \sum_{k=m-c+1}^{m} \binom{m}{k} (1-e^{-\lambda_\ell T})^{k}\, e^{-\lambda_\ell T (m-k)} . \tag{II.6.3-6}$$
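
The binomial tail of Eq. (II.6.3-6) can be evaluated directly. The
following is a minimal sketch; the function name
`disconnection_probability` and the argument name `lam_l` (line failure
rate) are hypothetical, and lines are assumed to fail independently
with exponential lifetimes.

```python
import math

def disconnection_probability(T, lam_l, m, c):
    """P{at least m-c+1 of the m pooled data buses fail within T}, Eq. (II.6.3-6)."""
    q = 1.0 - math.exp(-lam_l * T)   # probability that a single line fails by T
    return sum(math.comb(m, k) * q**k * (1.0 - q) ** (m - k)
               for k in range(m - c + 1, m + 1))

# Illustrative values only: 5 pooled buses serving 3 critical subsystems.
print(disconnection_probability(T=10.0, lam_l=0.01, m=5, c=3))
```

Two limiting cases are easy to check by hand: with m = c any single
line failure disconnects a subsystem, and with c = 1 all m lines must
fail.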


We need, however, at least $m_0$ buses to survive to limit network 
congestion. Subsequently, the network loss probability $\hat{P}_{NL}(T)$ 
is given by 

$$\hat{P}_{NL}(T) = 1 - \{1-(1-e^{-\lambda_c T})^{N-1}[1+e^{-\lambda_c T}(N-1-NP_c)]\} \cdot \sum_{k=m_0}^{m} \binom{m}{k} e^{-\lambda_\ell T k}(1-e^{-\lambda_\ell T})^{m-k} \cdot \prod_{j=1}^{c} [1-(1-e^{-\lambda_u T})^{L_{i_j}}] . \tag{II.6.3-7}$$


To explain (II.6.3-7), we note that $1-\hat{P}_{NL}(T)$ expresses the 
probability of survival. Then, the first, second and third terms in 
(II.6.3-7) represent the probabilities of survival for the computer 
system, communication bus network and the critical subsystems, 
respectively. The product of the latter terms yields the probability 
of network survival. 

Eq. (II.6.3-7) can now be used to evaluate the data processing 
invulnerability characteristics, as well as to choose and adjust 
the underlying failure parameters, topological structure and routing 
discipline. In particular, we note that the following parameters 
are involved in computing the network loss probability: 

•The computer failure rate ($\lambda_c$); 

•The number of parallel computers (N) (here N=4); 

•The computer coverage probability ($P_c$) (here $P_c$=0.96 in 
redundant mode); 

•The duration of operational period under consideration (T); 

•The subsystem unit failure rate ($\lambda_u$); 

•The number of redundant units in a subsystem (L); 

•The set of critical subsystems (or tasks, $i_1, i_2, \ldots, i_c$); 

•The data-bus line failure rate ($\lambda_\ell$); 

•The number of data-bus lines commonly used to interconnect the 
critical subsystems with the computer complex (m); 

•The minimal number of data-bus lines required for a satisfactory 
interconnection (involving both reliability and congestion 
performance measures) of the critical subsystem to the computer 
complex ($m_0$). 

Incorporating all these parameters in Eq. (II.6.3-7), we 
compute the probability $\hat{P}_{NL}(T)$ of network loss within T units 
of time. Alternatively, for a prescribed maximal value of 
$\hat{P}_{NL}(T)$, we use Eq. (II.6.3-7) to determine the proper computer, 
subsystem and network (topological) parameters. 
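
Both uses of Eq. (II.6.3-7) can be sketched in a few lines of Python:
evaluating the loss probability for given parameters, and searching
for the smallest bus pool m that meets a prescribed loss bound. All
function and argument names below are hypothetical. N = 4 and
$P_c$ = 0.96 are the values quoted above; the failure rates, period,
redundancy levels and target bound are assumed for illustration only.

```python
import math

def computer_survival(T, lam_c, N, P_c):
    """First factor of Eq. (II.6.3-7): redundant computer complex with coverage P_c."""
    p = math.exp(-lam_c * T)
    return 1.0 - (1.0 - p) ** (N - 1) * (1.0 + p * (N - 1 - N * P_c))

def bus_survival(T, lam_l, m, m0):
    """Second factor: at least m0 of the m pooled data buses survive."""
    p = math.exp(-lam_l * T)
    return sum(math.comb(m, k) * p**k * (1.0 - p) ** (m - k) for k in range(m0, m + 1))

def subsystems_survival(T, lam_u, L_list):
    """Third factor: every critical subsystem keeps at least one of its L units."""
    q = 1.0 - math.exp(-lam_u * T)
    return math.prod(1.0 - q**L for L in L_list)

def network_loss(T, lam_c, N, P_c, lam_l, m, m0, lam_u, L_list):
    """Network loss probability as the complement of the product of survivals."""
    return 1.0 - (computer_survival(T, lam_c, N, P_c)
                  * bus_survival(T, lam_l, m, m0)
                  * subsystems_survival(T, lam_u, L_list))

def smallest_m(target, m_max, T, lam_c, N, P_c, lam_l, m0, lam_u, L_list):
    """Smallest pool size m in [m0, m_max] meeting the loss bound, else None."""
    for m in range(m0, m_max + 1):
        if network_loss(T, lam_c, N, P_c, lam_l, m, m0, lam_u, L_list) <= target:
            return m
    return None

# Assumed example parameters (rates in failures/sec, T in sec).
params = dict(T=100.0, lam_c=1e-4, N=4, P_c=0.96,
              lam_l=1e-4, m0=2, lam_u=1e-4, L_list=[3, 3])
print(smallest_m(target=1e-5, m_max=16, **params))
```

The search returns None when the bound cannot be met by adding buses
alone, which happens whenever the computer and subsystem terms already
exceed the target.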

We finally note that the network (deterministic) connectivity 
measures are given as follows. 

$\kappa$ = network index of critical connectivity 

= minimal number of lines whose failure disconnects the 
critical subsystems 

= $m-c+1$; 


$\kappa(D_0)$ = network index of critical stable connectivity 

= minimal number of lines whose failure causes the message 
delay to increase above $D_0$ sec 

= $m-m_0+1$. 

The associated probabilistic connectivity measure is given by 
$\hat{P}_{NL}(T)$ and expressed by Eq. (II.6.3-7). 

II.6.4 Failure Analysis for the Data Processing Network: The Simplex Mode 

Under the simplex mode of operation, tasks and subsystems are 
divided between two GPCs. The remaining GPCs can then serve as 
stand-by units. 

To characterize system invulnerability to failures of GPCs, 
data-bus lines and application subsystems, we compute the network 
loss probability $q_{NL}(T)$, defined by Eq. (II.6.1-8). This function 
expresses the probability that a critical subsystem under 
consideration cannot be connected to a GPC, within T units of 
operational time, under the simplex mode. For that purpose, the 
following network structure and parameters are specified. 

a) The computer failure rate is equal to $\lambda_c$ [failures/sec]. 

b) Two computers need to be in operation. Three computers 
are initially in a stand-by mode. Upon the failure of 
a computer, a stand-by GPC is immediately used to replace 
it, if any operational stand-by computer is available. 
The computer system is said to be in a state of system 
loss if there are not two operational GPCs. 

c) The computer coverage probability (of failure detection 
by self-test methods) is equal to $P_c$. 

d) The data-bus line failure rate is equal to $\lambda_\ell$ [failures/sec]. 

e) The subsystem under consideration contains L redundant 
units. The unit failure rate is equal to $\lambda_u$ [failures/sec]. 

f) The subsystem under consideration can use lines taken 
from a set of m data bus lines. It requires, however, 
a minimum of $m_0$ lines, $1 \le m_0 \le m$, from this set of m 
lines, to be able to conduct its information-processing 
tasks in a satisfactory manner. 

g) As an alternative topological model, replacing (f), it can 
be assumed that the m data-bus lines are shared by $m_1$ 
subsystems (or tasks). Each subsystem requires at least 
a single (distinct) data bus line for its connection to 
a GPC. 

We use the above mentioned system conditions and parameters to 
evaluate the network loss probability $q_{NL}(T)$. We start by using the 
study results concerning the failure of the simplex computer system, 
as presented in Section II.3. By equation (II.3.2-4), the probability 
$q_{SL}(T)$ that the simplex computer system will fail in T units of time, 
when 5 GPCs are initially available, two GPCs are operating 
simultaneously, and computer system failure is declared when at least 
four GPCs have failed, is given by 

$$q_{SL}(T) = 1 - \sum_{n=0}^{3} e^{-2\lambda_c T}\,\frac{(2\lambda_c T)^{n}}{n!} = 1 - e^{-2\lambda_c T}\left[1 + 2\lambda_c T + \tfrac{1}{2}(2\lambda_c T)^{2} + \tfrac{1}{6}(2\lambda_c T)^{3}\right] . \tag{II.6.4-1}$$
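
The simplex computer loss probability is the tail of a Poisson
distribution: with two GPCs always operating, failures arrive at rate
$2\lambda_c$, and the system is lost at the fourth failure. The sketch
below evaluates both the summation and the expanded bracket form so
that one can be checked against the other. Function names and the
numerical values are hypothetical.

```python
import math

def simplex_computer_loss(T, lam_c, tolerated=3):
    """P{more than `tolerated` GPC failures in T}; failures are Poisson with rate 2*lam_c."""
    x = 2.0 * lam_c * T
    return 1.0 - sum(math.exp(-x) * x**n / math.factorial(n)
                     for n in range(tolerated + 1))

def simplex_computer_loss_closed(T, lam_c):
    """Same quantity written with the bracketed polynomial expanded."""
    x = 2.0 * lam_c * T
    return 1.0 - math.exp(-x) * (1.0 + x + x**2 / 2.0 + x**3 / 6.0)

# Assumed values: lam_c = 1e-3 failures/sec, T = 1000 sec.
print(simplex_computer_loss(1000.0, 1e-3), simplex_computer_loss_closed(1000.0, 1e-3))
```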


The probability that the application subsystem under consideration 
will fail, denoted as $q_A(T)$, is obtained by recognizing the latter 
to fail if and only if all the associated units fail. Therefore, 
we have 

$$q_A(T) = [1-e^{-\lambda_u T}]^{L} . \tag{II.6.4-2}$$


Under assumption (f), the interconnecting data-bus network can 
serve the underlying subsystem as long as it has at least $m_0$, out 
of m, operating data-bus lines. Therefore, the probability $q_{L1}(T)$ 
that the associated interconnecting data-bus network fails, under 
condition (f), is given by 

$$q_{L1}(T) = \sum_{k=m-m_0+1}^{m} \binom{m}{k}(1-e^{-\lambda_\ell T})^{k}\, e^{-\lambda_\ell T(m-k)} . \tag{II.6.4-3}$$

Subsequently, the probability $1-q_{L1}(T)$ that the interconnecting 
network survives in T units of time (i.e., that it provides a 
connection between the underlying subsystem and the GPC) is equal to 

$$1 - q_{L1}(T) = \sum_{k=0}^{m-m_0} \binom{m}{k}(1-e^{-\lambda_\ell T})^{k}\, e^{-\lambda_\ell T(m-k)} = \sum_{k=m_0}^{m} \binom{m}{k} e^{-\lambda_\ell T k}(1-e^{-\lambda_\ell T})^{m-k} . \tag{II.6.4-4}$$

Combining these expressions, we obtain the probability $q_{NL}(T)$ 
of network loss, under the simplex mode, in T units of time, by 
writing 

$$1-q_{NL}(T) = [1-q_{SL}(T)]\,[1-q_A(T)]\,[1-q_{L1}(T)] . \tag{II.6.4-5}$$


Eq. (II.6.4-5) expresses the probability $1-q_{NL}(T)$ of network 
survival as the product of the survival probabilities of the 
simplex computer system, the underlying subsystem and the 
interconnecting data-bus network. Subsequently, substituting 
(II.6.4-1)-(II.6.4-4) into (II.6.4-5), the network loss probability 
$q_{NL}(T)$ is given by the following formula: 

$$q_{NL}(T) = 1 - \left\{e^{-2\lambda_c T}\left[1+2\lambda_c T+2(\lambda_c T)^{2}+\tfrac{4}{3}(\lambda_c T)^{3}\right]\right\} \left\{1 - [1-e^{-\lambda_u T}]^{L}\right\} \sum_{k=m_0}^{m} \binom{m}{k} e^{-\lambda_\ell T k}(1-e^{-\lambda_\ell T})^{m-k} . \tag{II.6.4-6}$$


Using Eq. (II.6.4-5) we can evaluate the network invulnerability to 
failures of GPCs, data buses and subsystem units, under the simplex 
mode of operation. 
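
A minimal sketch of the simplex-mode evaluation under condition (f)
follows. It multiplies the three survival probabilities (computer
system, subsystem, data-bus network) and returns the complement. The
function name, argument names and numerical values are hypothetical,
and all failures are assumed independent with exponential lifetimes.

```python
import math

def simplex_network_loss(T, lam_c, lam_u, L, lam_l, m, m0):
    """Simplex-mode network loss under condition (f), as in Eq. (II.6.4-6)."""
    x = lam_c * T
    # Computer system survives if at most 3 GPC failures occur (rate 2*lam_c).
    computer_ok = math.exp(-2.0 * x) * (1.0 + 2.0 * x + 2.0 * x**2 + (4.0 / 3.0) * x**3)
    # Subsystem survives unless all L redundant units fail.
    subsystem_ok = 1.0 - (1.0 - math.exp(-lam_u * T)) ** L
    # Bus network survives if at least m0 of the m lines remain operational.
    p = math.exp(-lam_l * T)
    bus_ok = sum(math.comb(m, k) * p**k * (1.0 - p) ** (m - k) for k in range(m0, m + 1))
    return 1.0 - computer_ok * subsystem_ok * bus_ok

# Assumed example: compare pools of 2, 3 and 4 lines when m0 = 2 are required.
for m in (2, 3, 4):
    print(m, simplex_network_loss(T=100.0, lam_c=1e-3, lam_u=2e-3, L=2,
                                  lam_l=3e-3, m=m, m0=2))
```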

In deriving Eq. (II.6.4-6) we have assumed that the underlying 
subsystem can employ rerouting procedures in utilizing any one of 
the operating lines, out of the m initially available data-bus 
lines, as long as no fewer than $m_0$ data-bus lines are in operation. 

Alternatively, to model the sharing of the pool of data bus 
lines by a number of subsystems, we now assume condition (g) to 
hold. Then, $m_1$ subsystems share the utilization of m data-bus 
lines. Note, however, that only a single subsystem is allowed 
to use a certain operational data bus at one time. (Thus, no 
simultaneous use of a data bus by several subsystems is considered.) 
Each subsystem requires at least a single (distinct) data bus line 
for its connection to a GPC. Now, the probability $q_{L2}(T)$ of failure 
of the data-bus network, relative to the subsystem under consideration, 
is computed as follows. 

The data-bus network cannot interconnect the subsystem under 
consideration if and only if at a certain time, prior to T, the 
line connected to the subsystem fails, and the number of operational 
lines then is smaller than $m_1$ (so that all operational lines are 
occupied). We set 

$$f(u)\,du = P\{(m-m_1)\text{-th line failure occurs in } (u,u+du)\} . \tag{II.6.4-7}$$


Since, until time u, line interfailure times are i.i.d. exponentially 
distributed with mean $(m_1\lambda_\ell)^{-1}$, we find f(u) to be the 
Gamma density 

$$f(u) = \frac{m_1\lambda_\ell\,(m_1\lambda_\ell u)^{m-m_1-1}}{(m-m_1-1)!}\; e^{-m_1\lambda_\ell u} , \qquad u > 0 . \tag{II.6.4-8}$$


The probability $q_{L2}(T)$ of bus-network loss, relative to the 
underlying subsystem, is subsequently given by 

$$q_{L2}(T) = \int_{0}^{T} f(u)\,[1-e^{-\lambda_\ell (T-u)}]\,du . \tag{II.6.4-9}$$

Eq. (II.6.4-9) indicates that a bus network loss event will occur 
if, at some time u, only $m_1$ lines (out of the initial m lines) are 
left, and in the following T-u units of time the line connecting the 
subsystem under consideration fails. Substituting (II.6.4-8) in 
(II.6.4-9) we conclude the result 

$$q_{L2}(T) = \int_{0}^{T} \frac{m_1\lambda_\ell\,(m_1\lambda_\ell u)^{m-m_1-1}}{(m-m_1-1)!}\; e^{-m_1\lambda_\ell u}\,[1-e^{-\lambda_\ell (T-u)}]\,du . \tag{II.6.4-10}$$

As before, the computer system loss probability $q_{SL}(T)$ and 
the subsystem loss probability $q_A(T)$ are given by Eqs. (II.6.4-1) and 
(II.6.4-2), respectively. Also, the network probability of survival 
is expressed in accordance with formula (II.6.4-5). We subsequently 
conclude that the network loss probability under condition (g), 
for the simplex mode, denoted as $q_{NL}(T)$, is given by 

$$q_{NL}(T) = 1 - e^{-2\lambda_c T}\left[1+2\lambda_c T+2(\lambda_c T)^{2}+\tfrac{4}{3}(\lambda_c T)^{3}\right] \left\{1 - [1-e^{-\lambda_u T}]^{L}\right\}\{1-q_{L2}(T)\} , \tag{II.6.4-11}$$

where $q_{L2}(T)$ is given by Eq. (II.6.4-10). 
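
The integral of Eq. (II.6.4-10) has no compact general form, so a
numerical evaluation is convenient. The sketch below uses composite
Simpson integration, which is a choice made here and not a method
stated in the study. The function names, argument names and numerical
values are likewise hypothetical.

```python
import math

def bus_loss_shared(T, lam_l, m, m1, steps=2000):
    """q_L2(T) of Eq. (II.6.4-10) by composite Simpson integration (even `steps`)."""
    k = m - m1                       # spare lines; the k-th failure exhausts them
    if k < 1 or steps % 2:
        raise ValueError("need m > m1 and an even number of steps")
    rate = m1 * lam_l                # aggregate failure rate of the m1 lines in use

    def integrand(u):
        gamma_pdf = rate * (rate * u) ** (k - 1) * math.exp(-rate * u) / math.factorial(k - 1)
        return gamma_pdf * (1.0 - math.exp(-lam_l * (T - u)))

    h = T / steps
    total = integrand(0.0) + integrand(T)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def simplex_network_loss_shared(T, lam_c, lam_u, L, lam_l, m, m1):
    """Simplex-mode network loss under condition (g), as in Eq. (II.6.4-11)."""
    x = lam_c * T
    computer_ok = math.exp(-2.0 * x) * (1.0 + 2.0 * x + 2.0 * x**2 + (4.0 / 3.0) * x**3)
    subsystem_ok = 1.0 - (1.0 - math.exp(-lam_u * T)) ** L
    return 1.0 - computer_ok * subsystem_ok * (1.0 - bus_loss_shared(T, lam_l, m, m1))

# Assumed example: 5 lines shared by 2 subsystems.
print(bus_loss_shared(T=50.0, lam_l=0.01, m=5, m1=2))
print(simplex_network_loss_shared(T=50.0, lam_c=1e-3, lam_u=2e-3, L=2,
                                  lam_l=0.01, m=5, m1=2))
```

With m = 3 and $m_1$ = 2 the integral can be carried out by hand and
gives $(1-e^{-\lambda_\ell T})^{2}$, which offers a convenient check of
the numerical routine.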

The mean time to failure of the interconnecting network, relative 
to the subsystem under consideration, is now given by 

$$\bar{T}_{L2} = \frac{m-m_1}{m_1\lambda_\ell} + \frac{1}{\lambda_\ell} . \tag{II.6.4-12}$$
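
The mean time to failure can be cross-checked by simulation. The
sketch below assumes, following Eqs. (II.6.4-8) and (II.6.4-9), that
the time to bus-network loss is the sum of a Gamma-distributed phase
(the $m-m_1$ failures that exhaust the spare lines, at aggregate rate
$m_1\lambda_\ell$) and an exponential phase (the remaining life of the
subsystem's own line). Function names, the seed and the numerical
values are hypothetical.

```python
import random

def mean_time_to_bus_loss(lam_l, m, m1):
    """Sum of the mean Gamma phase and the mean exponential phase."""
    return (m - m1) / (m1 * lam_l) + 1.0 / lam_l

def simulate_mean_time(lam_l, m, m1, runs=100_000, seed=1):
    """Monte Carlo estimate of the same mean, for comparison."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        # Time until the (m - m1)-th failure among the m1 lines in use.
        spare_phase = rng.gammavariate(float(m - m1), 1.0 / (m1 * lam_l))
        # Remaining life of the line serving the subsystem under consideration.
        own_line = rng.expovariate(lam_l)
        total += spare_phase + own_line
    return total / runs

# Assumed example: lam_l = 0.01 failures/sec, 5 lines shared by 2 subsystems.
print(mean_time_to_bus_loss(0.01, 5, 2), simulate_mean_time(0.01, 5, 2))
```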


In the same manner we derive the formula for the network loss 
probability when it is assumed that different subsystems (tasks) 
can share certain data buses on a time division multiplexing (TDM) 
basis. Then, if we assume that a single data bus can be time-shared 
among $m_T$ subsystems (tasks), the following results are obtained. 

Under condition (g), with TDM lines, the data-bus network 
would not be able to interconnect the subsystem under consideration 
if and only if at a certain time, prior to T, the line connected 
to this subsystem fails, and the number of operational lines is 
smaller than $\lceil m_1/m_T \rceil$, the latter denoting the smallest 
integer not smaller than $m_1/m_T$. Therefore, $q_{L2}(T)$ is given by 
Eq. (II.6.4-10) with $m_1$ there replaced by $\lceil m_1/m_T \rceil$. The 
network loss probability is subsequently given by Eq. (II.6.4-11) 
with $q_{L2}(T)$ expressed as 